Abstract

We consider the statistical experiment given by a sample $y(1),\dots,y(n)$ of a stationary Gaussian process with an unknown smooth spectral density $f$. Asymptotic equivalence, in the sense of Le Cam's deficiency $\Delta$-distance, to two Gaussian experiments with simpler structure is established. The first one is given by independent zero mean Gaussians with variance approximately $f(\omega_i)$, where $\omega_i$ is a uniform grid of points in $(-\pi,\pi)$ (nonparametric Gaussian scale regression). This approximation is closely related to well-known asymptotic independence results for the periodogram and corresponding inference methods. The second asymptotic equivalence is to a Gaussian white noise model where the drift function is the log-spectral density. This represents the step from a Gaussian scale model to a location model, and also has a counterpart in established inference methods, i.e. log-periodogram regression. The problem of simple explicit equivalence maps (Markov kernels), which would allow inference to be carried over directly, appears in this context but is not solved here.



1 Introduction and main results 

Estimation of the spectral density $f(\omega)$, $\omega \in [-\pi,\pi]$, of a stationary process is an important and traditional problem of mathematical statistics. We observe a sample $y^{(n)} = (y(1),\dots,y(n))'$ from a real Gaussian stationary sequence $y(t)$ with $Ey(t) = 0$ and autocovariance function $\gamma(h) = Ey(t)y(t+h)$. Consider the spectral density, defined on $[-\pi,\pi]$ by

$$f(\omega) = \frac{1}{2\pi}\sum_{h=-\infty}^{\infty}\gamma(h)\exp(ih\omega), \tag{1.1}$$

where it is assumed that $\sum_{h=-\infty}^{\infty}\gamma^2(h) < \infty$. Let $\Gamma_n$ be the $n\times n$ Toeplitz covariance matrix associated with $\gamma(\cdot)$, i.e. the matrix with entries

$$(\Gamma_n)_{j,k} = \int_{-\pi}^{\pi}\exp\left(i(k-j)\omega\right)f(\omega)\,d\omega, \quad j,k = 1,\dots,n. \tag{1.2}$$

Write $\Gamma_n(f)$ for the covariance matrix corresponding to spectral density $f$ and note that $y^{(n)}$ has a multivariate normal distribution $N_n(0,\Gamma_n(f))$. Let $\Sigma$ be a nonparametric set of spectral densities to be described below. We are interested in the approximation of the statistical experiment

$$\mathcal{E}_n = \left(N_n(0,\Gamma_n(f)),\ f\in\Sigma\right) \tag{1.3}$$

in the sense of Le Cam's deficiency pseudodistance $\Delta(\cdot,\cdot)$; see the end of this section for a precise definition. The statistical interpretation of the Le Cam distance is as follows. For two experiments $\mathcal{E}$ and $\mathcal{F}$ having the same parameter space, $\Delta(\mathcal{E},\mathcal{F}) < \varepsilon$ implies that for any decision problem with loss bounded by 1 and any statistical procedure in the experiment $\mathcal{E}$ there is a (randomized) procedure in $\mathcal{F}$ whose risk, evaluated in $\mathcal{F}$, nearly matches (within $\varepsilon$) the risk of the original procedure evaluated in $\mathcal{E}$. In this statement the roles of $\mathcal{E}$ and $\mathcal{F}$ can also be reversed. Two sequences $\mathcal{E}_n$, $\mathcal{F}_n$ are said to be asymptotically equivalent if $\Delta(\mathcal{E}_n,\mathcal{F}_n) \to 0$.

As a guide to what can be expected, consider first the case where $f_\vartheta$, $\vartheta\in\Theta$, is a smooth parametric family of spectral densities. Assume that $\Theta$ is a real interval; under some regularity conditions, the model is well known to fulfill the standard LAN conditions with localization rate $n^{-1/2}$ and normalized Fisher information at $\vartheta$

$$\frac{1}{4\pi}\int_{-\pi}^{\pi}\left(\frac{\partial}{\partial\vartheta}\log f_\vartheta(\omega)\right)^2 d\omega$$

(Davies (1973), Dzhaparidze (1985), chap. 1.3, cf. also the discussion in van der Vaart (1998), Example 7.17). Consider the parametric Gaussian white noise model where the signal is the log-spectral density:

$$dZ_\omega = \log f_\vartheta(\omega)\,d\omega + 2\pi^{1/2}n^{-1/2}\,dW_\omega, \quad \omega\in[-\pi,\pi], \tag{1.4}$$

and note that in the family $(f_\vartheta,\ \vartheta\in\Theta)$, this model has the same asymptotic Fisher information. This is in agreement with the LAN result for the spectral density model, but it suggests that the above white noise approximation might also be true for larger (i.e. nonparametric) spectral density classes $\Sigma$.

As a second piece of evidence for the white noise approximation in the nonparametric case we take known results about the approximate spectral decomposition of the Toeplitz covariance matrix $\Gamma_n(f)$. It is a classical difficulty in time series analysis that the exact eigenvalues and eigenvectors of $\Gamma_n(f)$ cannot easily be found and used for inference about $f$; in particular, the eigenvectors depend on $f$. However, for an approximation which is a circulant matrix (denoted $\tilde\Gamma_n(f)$ below), the eigenvectors are independent of $f$ and the eigenvalues are, up to a factor $2\pi$, approximately $f(\omega_j)$, where $\omega_j$ are the points of an equispaced grid of size $n$ in $[-\pi,\pi]$. If the approximation by $\tilde\Gamma_n(f)$ were justified, one could apply an orthogonal transformation to the data and obtain a Gaussian scale model

$$z_j = f^{1/2}(\omega_j)\,\xi_j, \quad j = 1,\dots,n, \tag{1.5}$$

where $\xi_j$ are independent standard normal. For this model, nonparametric asymptotic equivalence theory was developed in Grama and Nussbaum (1998). Results there, for certain smoothness classes $f\in\Sigma$, with $f$ bounded away from 0, lead to the nonparametric version of the white noise model (1.4):

$$dZ_\omega = \log f(\omega)\,d\omega + 2\pi^{1/2}n^{-1/2}\,dW_\omega, \quad \omega\in[-\pi,\pi],\ f\in\Sigma. \tag{1.6}$$

Our proof of asymptotic equivalence will in fact be based on the approximation of the covariance matrix $\Gamma_n(f)$ by the circulant $\tilde\Gamma_n(f)$, cf. Brockwell and Davis (1991), § 4.5. However we shall see that this tool does not enable a straightforward approximation of the data in total variation or Hellinger distance. Therefore our argument for asymptotic equivalence will be somewhat indirect, involving "bracketing" of the experiment $\mathcal{E}_n$ by upper and lower bounds (in the sense of informativity) and also a preliminary localization of the parameter space.

To formulate our main result, define a parameter space $\Sigma$ of spectral densities as follows. For $M > 0$, define a set of real valued even functions on $[-\pi,\pi]$

$$\mathcal{F}_M = \left\{f :\ M^{-1}\le f(\omega),\ f(\omega) = f(-\omega),\ \omega\in[-\pi,\pi]\right\}.$$

Thus our spectral densities are assumed uniformly bounded away from 0. Let $L_2(-\pi,\pi)$ be the usual (real) $L_2$-space on $[-\pi,\pi]$; for any $f\in L_2(-\pi,\pi)$, let $\gamma_f(k)$, $k\in\mathbb{Z}$, be the Fourier coefficients according to (1.1). For any $\alpha > 0$ and $M > 0$ let

$$W^\alpha(M) = \left\{f\in L_2(-\pi,\pi) :\ \gamma_f^2(0) + \sum_{k=-\infty}^{\infty}|k|^{2\alpha}\gamma_f^2(k)\le M\right\}. \tag{1.7}$$

These sets correspond to balls in the periodic fractional Sobolev scale with smoothness coefficient $\alpha$. Note that for $\alpha > 1/2$, by an embedding theorem (see the corresponding lemma in the Appendix), functions in $W^\alpha(M)$ are also uniformly bounded. Define an a priori set for given $\alpha > 0$, $M > 0$

$$\Sigma_{\alpha,M} = W^\alpha(M)\cap\mathcal{F}_M.$$

Consider also a Gaussian scale model (1.5) where the values $f(\omega_j)$ are replaced by local averages

$$J_{j,n}(f) = n\int_{(j-1)/n}^{j/n} f(2\pi x-\pi)\,dx, \quad j = 1,\dots,n.$$

Theorem 1.1 Let $\Sigma$ be a set of spectral densities contained in $\Sigma_{\alpha,M}$ for some $M > 0$ and $\alpha > 1/2$. Then the experiments given by observations

$y(1),\dots,y(n)$, a stationary centered Gaussian sequence with spectral density $f$,
$z_1,\dots,z_n$, where $z_j$ are independent $N(0,J_{j,n}(f))$,

with $f\in\Sigma$ are asymptotically equivalent.
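
To make the second experiment of Theorem 1.1 concrete, the following Python sketch approximates the local averages $J_{j,n}(f)$ by a midpoint rule and draws one sample $z_1,\dots,z_n$. The AR(1) spectral density, the value `rho = 0.5`, the quadrature resolution and all function names are illustrative assumptions and do not appear in the paper.

```python
import numpy as np

def spectral_density_ar1(omega, rho=0.5):
    """Illustrative spectral density (assumption): AR(1) with unit innovation variance,
    f(w) = 1 / (2*pi*(1 - 2*rho*cos(w) + rho**2))."""
    return 1.0 / (2.0 * np.pi * (1.0 - 2.0 * rho * np.cos(omega) + rho ** 2))

def local_averages(f, n, points_per_cell=200):
    """Approximate J_{j,n}(f) = n * int_{(j-1)/n}^{j/n} f(2*pi*x - pi) dx, j = 1..n.
    Since each cell has width 1/n, J_{j,n} is the average of f over the cell,
    approximated here by a midpoint rule with `points_per_cell` nodes."""
    offsets = (np.arange(points_per_cell) + 0.5) / points_per_cell
    x = (np.arange(n)[:, None] + offsets[None, :]) / n   # shape (n, points_per_cell)
    return f(2.0 * np.pi * x - np.pi).mean(axis=1)

def sample_scale_model(f, n, rng):
    """Draw z_1..z_n with independent z_j ~ N(0, J_{j,n}(f))."""
    return np.sqrt(local_averages(f, n)) * rng.standard_normal(n)

n = 101
J = local_averages(spectral_density_ar1, n)
z = sample_scale_model(spectral_density_ar1, n, np.random.default_rng(0))
```

Because $f$ is even, the computed averages are symmetric in $j$, and their mean over $j$ approximates $(2\pi)^{-1}\int_{-\pi}^{\pi} f(\omega)\,d\omega = \gamma(0)/2\pi$.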

Let $\|\cdot\|_{B^\alpha_{p,q}}$ be the Besov norm on the interval $[-\pi,\pi]$ with smoothness index $\alpha$ (see Appendix, Section 5.3). For the second main result we impose a smoothness condition involving this norm for the $\alpha > 1/2$ from above and $p = q = 6$.

Theorem 1.2 Let $\Sigma$ be a set of spectral densities as in Theorem 1.1, fulfilling additionally $\|f\|_{B^\alpha_{6,6}}\le M$ for all $f\in\Sigma$. Then the experiments given respectively by observations

$z_1,\dots,z_n$, where $z_j$ are independent $N(0,J_{j,n}(f))$,
$dZ_\omega = \log f(\omega)\,d\omega + 2\pi^{1/2}n^{-1/2}\,dW_\omega$, $\omega\in[-\pi,\pi]$,

with $f\in\Sigma$ are asymptotically equivalent.

The proof of this result is in the thesis Zhou (2004). The present paper is devoted to the proof of Theorem 1.1.

In nonparametric asymptotic equivalence theory, some constructive results have recently been obtained, i.e. explicit equivalence maps have been exhibited which allow optimal decision functions to be carried over from one sequence of experiments to the other. Brown and Low (1996) and Brown, Low and Zhang (2002) obtained constructive results for white noise with drift and Gaussian regression with nonrandom and random design. Brown, Carter, Low and Zhang (2004) found such equivalence maps (Markov kernels) for the i.i.d. model on the unit interval (density estimation) and the model of Gaussian white noise with drift; cf. also Carter (2002). The theoretical (nonconstructive) variant of this result had earlier been established in Nussbaum (1996), in the sense of an existence proof for the pertaining Markov kernels. This indirect approach relied on the well known connection to likelihood processes of experiments, cf. Le Cam and Yang (2000). In the present paper, the result of Theorem 1.1 is of nonconstructive type, using a variety of methods for bounding the $\Delta$-distance between the time series experiment and the model of independent zero mean Gaussians. Similarly, the proof of Theorem 1.2 in Zhou (2004) is nonconstructive, but it appears likely that in a second step, relatively simple "workable" equivalence maps can be found, at least for the case of Theorem 1.1, which is related to the classical result about asymptotic independence of discrete Fourier transforms.

To further discuss the context of the main results, we note the following points.

1. Asymptotic independence of discrete Fourier transforms. Let

$$d_n(\omega) = \sum_{k=1}^{n}\exp(-ik\omega)\,y(k), \quad \omega\in(-\pi,\pi)$$

be the discrete Fourier transform of the time series $y(1),\dots,y(n)$. Assume $n$ is uneven and let $\eta_j$ be complex standard normal variables. It is well known that for the Fourier frequencies $\omega_j = 2\pi j/n$, $j = 1,\dots,(n-1)/2$ in $(0,\pi)$, there is an asymptotic distribution

$$(\pi n)^{-1/2}d_n(\omega_j)\approx\exp(i\omega_j)\,f^{1/2}(\omega_j)\,\eta_j$$

and the values are asymptotically uncorrelated for distinct $\omega_j$, $\omega_k$. For a precise formulation cf. relation (2.12) below or Brockwell and Davis (1991), Proposition 4.5.2. This fact is the basis for many inference methods (e.g. Dahlhaus and Janas (1996)); see Lahiri (2003) for an extended discussion of the asymptotic independence. A linear transformation to $n-1$ independent real normals and adding a real normal according to $(2\pi n)^{-1/2}d_n(0)\approx N(0,f(0))$ suggests the Gaussian scale model (1.5).

2. Log-periodogram regression. Consider also the periodogram

$$I_n(\omega) = \frac{1}{2\pi n}\left|d_n(\omega)\right|^2.$$

Note the equality in distribution $|\eta_j|^2\sim\chi_2^2\sim 2e_j$, where $e_j$ is standard exponential. As a consequence of the above result about $d_n(\omega_j)$, we have for $j = 1,\dots,(n-1)/2$

$$I_n(\omega_j)\approx f(\omega_j)\,e_j \tag{1.8}$$

with asymptotic independence. Assuming this model exact and taking a logarithm gives rise to the inference method of log-periodogram regression (for an account cf. Fan and Gijbels (1996), sec. 6.4).

3. The Whittle approximation. This is an approximation to $-n^{-1}$ times the log-likelihood of the time series $y(1),\dots,y(n)$. In a parametric model $f_\vartheta$, $\vartheta\in\Theta$, with multivariate normal law $N_n(0,\Gamma_n(f_\vartheta))$, computation of the MLE involves inverting the covariance matrix $\Gamma_n(f_\vartheta)$, which is difficult since both eigenvectors and eigenvalues depend on $\vartheta$ in general. Replacing $\Gamma_n^{-1}(f_\vartheta)$ by $\Gamma_n(1/4\pi^2f_\vartheta)$ and using an approximation to $n^{-1}\log\det\Gamma_n(f_\vartheta)$ leads to an expression $L^W(f)+\log 2\pi$ where

$$L^W(f) = \frac{1}{4\pi}\int_{-\pi}^{\pi}\left(\log f(\omega)+\frac{I_n(\omega)}{f(\omega)}\right)d\omega \tag{1.9}$$

is the Whittle likelihood (cf. Dahlhaus (1988) for a brief exposition and references). A closely related expression is obtained by assuming the model (1.8) exact: then $-n^{-1}$ times the log-likelihood is

$$\tilde L^W(f) = \frac{1}{n}\sum_{j=1}^{(n-1)/2}\left(\log f(\omega_j)+\frac{I_n(\omega_j)}{f(\omega_j)}\right),$$

i.e. a discrete approximation to (1.9). For applications of the Whittle likelihood to nonparametric inference cf. Dahlhaus and Polonik (2002).

4. Asymptotics for $L^W(f)$. The accuracy of the Whittle approximation has been described as follows (Coursol and Dacunha-Castelle (1982), Dzhaparidze (1986), Theorem 1, p. 52). Let $L_n(f)$ be the log-likelihood in the experiment (1.3); then

$$L_n(f) = -nL^W(f)-n\log 2\pi+O_P(1) \tag{1.10}$$

uniformly over $f\in\Sigma_{1/2,M}$. This justifies use of $L^W(f)$ as a contrast function, e.g. it yields asymptotic efficiency of the Whittle MLE in parametric models (Dzhaparidze (1986), Chap. II), but falls short of providing asymptotic equivalence in the Le Cam sense. Indeed if (1.10) were true with $o_P(1)$ in place of $O_P(1)$ and with $L^W(f)$ replaced by $\tilde L^W(f)$, the discrete version obtained from the exact model (1.8), then this would already imply total variation equivalence, up to an orthogonal transform, of the exact model (1.8) with $f\in\Sigma_{1/2,M}$ (via the Scheffé lemma argument of Delattre and Hoffmann (2002)). In section 2 below (cf. relation (2.18)) we note a corresponding negative result, essentially that this total variation approximation over $f\in\Sigma_{1/2,M}$ does not take place.

5. Conditions for the theorems. For a narrower parameter space, i.e. a Hölder ball with smoothness index $\alpha > 1/2$, the result of Theorem 1.2 has been proved by Grama and Nussbaum (1998). Note that the Sobolev balls $W^\alpha(M)$ figuring in Theorem 1.1 are natural parameter sets of spectral densities since the smoothness condition is directly stated in terms of the autocovariance function $\gamma_f(\cdot)$. The Besov balls $B^\alpha_{p,p}(M)$ given in terms of the norm $\|\cdot\|_{B^\alpha_{p,p}}$ are intermediate between $L_2$-Sobolev and Hölder balls. For the white noise approximation of the i.i.d. (density estimation) model, Brown, Carter, Low and Zhang (2004) succeeded in weakening the Hölder ball condition in Nussbaum (1996) to a condition that $\Sigma$ is compact both in the Besov spaces $B^{1/2}_{2,2}$ and $B^{1/2}_{4,4}$ on the unit interval. This is immediately implied by $\Sigma\subset B^\alpha_{4,4}(M)$ for some $\alpha > 1/2$. Our condition for Theorem 1.2 is slightly stronger, i.e. $\Sigma\subset B^\alpha_{6,6}(M)$ for some $\alpha > 1/2$. In Remark 5.8 (Appendix) we note a sufficient condition in terms of the autocovariance function $\gamma_f(\cdot)$, i.e. give a description of the periodic version of the Besov ball.
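
Points 1 to 3 above can be illustrated numerically. The Python sketch below computes the periodogram $I_n(\omega_j)$ at the Fourier frequencies and evaluates the discrete Whittle criterion, i.e. $n^{-1}\sum_{j=1}^{(n-1)/2}(\log f(\omega_j)+I_n(\omega_j)/f(\omega_j))$, over a small grid of AR(1) models. The AR(1) family, the sample size, the parameter grid and all function names are assumptions made for illustration only; the grid search is a crude stand-in for a Whittle MLE and is not a procedure taken from the paper.

```python
import numpy as np

def periodogram(y):
    """I_n(w_j) = |d_n(w_j)|^2 / (2*pi*n) at w_j = 2*pi*j/n, j = 0..n-1."""
    n = len(y)
    omega = 2.0 * np.pi * np.arange(n) / n
    # d_n(w) = sum_{k=1}^n exp(-i*k*w) y(k); the FFT indexes k from 0, hence the phase factor
    d = np.exp(-1j * omega) * np.fft.fft(y)
    return omega, np.abs(d) ** 2 / (2.0 * np.pi * n)

def discrete_whittle(y, f):
    """(1/n) * sum_{j=1}^{(n-1)/2} [log f(w_j) + I_n(w_j) / f(w_j)] for odd n."""
    n = len(y)
    omega, I = periodogram(y)
    j = np.arange(1, (n - 1) // 2 + 1)
    return np.sum(np.log(f(omega[j])) + I[j] / f(omega[j])) / n

def ar1_density(rho):
    """Spectral density of an AR(1) process with unit innovation variance (assumption)."""
    return lambda w: 1.0 / (2.0 * np.pi * (1.0 - 2.0 * rho * np.cos(w) + rho ** 2))

def simulate_ar1(n, rho, rng):
    y = np.empty(n)
    y[0] = rng.standard_normal() / np.sqrt(1.0 - rho ** 2)   # stationary start
    eps = rng.standard_normal(n)
    for t in range(1, n):
        y[t] = rho * y[t - 1] + eps[t]
    return y

rng = np.random.default_rng(0)
y = simulate_ar1(1001, 0.5, rng)
grid = np.linspace(0.0, 0.9, 10)
values = [discrete_whittle(y, ar1_density(r)) for r in grid]
rho_hat = grid[int(np.argmin(values))]   # grid minimizer of the discrete Whittle criterion
```

The phase factor in `periodogram` only aligns the FFT indexing with the definition of $d_n(\omega)$ and does not affect $|d_n(\omega)|^2$.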

Throughout this paper we adopt the notation that $C$ represents a constant independent of $n$ and the parameter (spectral density) $f\in\Sigma$, and the value of which may change at each occurrence, even on the same line.



Relations between experiments. All measurable sample spaces are assumed to be Polish (complete separable) metric spaces equipped with their Borel sigma algebra. For measures $P$, $Q$ on the same sample space, let $\|P-Q\|_{TV}$ be the total variation distance. For the general case where $P$, $Q$ are not necessarily on the same sample space, suppose $K$ is a Markov kernel such that $KP$ is a measure on the same sample space as $Q$. In that case, $\|Q-KP\|_{TV}$ is defined; $K$ will be used as generic notation for a Markov kernel.

Consider now experiments (families of measures) $\mathcal{F} = (Q_f,\ f\in\Sigma)$ and $\mathcal{E} = (P_f,\ f\in\Sigma)$, with the same parameter space $\Sigma$. All experiments here are assumed dominated by a sigma-finite measure on their respective sample space. If $\mathcal{E}$ and $\mathcal{F}$ are on the same sample space, define their total variation distance

$$\Delta_0(\mathcal{E},\mathcal{F}) = \sup_{f\in\Sigma}\|Q_f-P_f\|_{TV}.$$

In the general case, the deficiency of $\mathcal{E}$ with respect to $\mathcal{F}$ is defined as

$$\delta(\mathcal{E},\mathcal{F}) = \inf_K\sup_{f\in\Sigma}\|Q_f-KP_f\|_{TV}$$

where inf extends over all appropriate Markov kernels. Le Cam's pseudodistance $\Delta(\cdot,\cdot)$ between $\mathcal{E}$ and $\mathcal{F}$ then is

$$\Delta(\mathcal{E},\mathcal{F}) = \max\left(\delta(\mathcal{E},\mathcal{F}),\ \delta(\mathcal{F},\mathcal{E})\right).$$

Furthermore, we will use the following notation involving experiments $\mathcal{E}$, $\mathcal{F}$ or sequences of such $\mathcal{E}_n = (P_{n,f},\ f\in\Sigma)$ and $\mathcal{F}_n = (Q_{n,f},\ f\in\Sigma)$.

Notation.

$\mathcal{E}\preceq\mathcal{F}$ ($\mathcal{F}$ more informative than $\mathcal{E}$): $\delta(\mathcal{F},\mathcal{E}) = 0$

$\mathcal{E}\sim\mathcal{F}$ (equivalent): $\Delta(\mathcal{E},\mathcal{F}) = 0$

$\mathcal{E}_n\simeq\mathcal{F}_n$ (asymptotically total variation equivalent): $\Delta_0(\mathcal{F}_n,\mathcal{E}_n)\to 0$

$\mathcal{E}_n\precsim\mathcal{F}_n$ ($\mathcal{F}_n$ asymptotically more informative than $\mathcal{E}_n$): $\delta(\mathcal{F}_n,\mathcal{E}_n)\to 0$

$\mathcal{E}_n\approx\mathcal{F}_n$ (asymptotically equivalent): $\Delta(\mathcal{F}_n,\mathcal{E}_n)\to 0$

Note that "more informative" above is used in the sense of a semi-ordering, i.e. its actual meaning is "at least as informative". We shall also write the relation $\simeq$ in a less formal way between data vectors (writing, say, $x^{(n)}\simeq y^{(n)}$), if it is clear from the context which experiments the data vectors represent.



2 The periodic Gaussian experiment 



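From now on we shall assume that $n$ is uneven. Our argument for asymptotic equivalence is such that it easily allows extension to the case of general sequences $n\to\infty$ (cf. Remark 4.10 for details).

Recall that the covariance matrix $\Gamma_n = \Gamma_n(f)$ has the Toeplitz form $(\Gamma_n)_{j,k} = \gamma(k-j)$, $j,k = 1,\dots,n$, i.e.

$$\Gamma_n = \begin{pmatrix}\gamma(0) & \gamma(1) & \cdots & \gamma(n-2) & \gamma(n-1)\\ \gamma(1) & \gamma(0) & \cdots & \cdots & \gamma(n-2)\\ \vdots & & \ddots & & \vdots\\ \gamma(n-2) & \cdots & \cdots & \gamma(0) & \gamma(1)\\ \gamma(n-1) & \gamma(n-2) & \cdots & \gamma(1) & \gamma(0)\end{pmatrix}.$$

Following Brockwell and Davis (1991), § 4.5 we shall define a circulant matrix approximation by

$$\tilde\Gamma_n = \begin{pmatrix}\gamma(0) & \gamma(1) & \cdots & \gamma(2) & \gamma(1)\\ \gamma(1) & \gamma(0) & \cdots & \cdots & \gamma(2)\\ \vdots & & \ddots & & \vdots\\ \gamma(2) & \cdots & \cdots & \gamma(0) & \gamma(1)\\ \gamma(1) & \gamma(2) & \cdots & \gamma(1) & \gamma(0)\end{pmatrix}$$

where in the first row, the central element and the one following it coincide with $\gamma((n-1)/2)$. More precisely, for given uneven $n$ define a function on integers $h$ with $|h| < n$

$$\tilde\gamma_{(n)}(h) = \begin{cases}\gamma(h), & |h|\le(n-1)/2,\\ \gamma(n-|h|), & (n+1)/2\le|h|\le n-1,\end{cases}$$

and set

$$(\tilde\Gamma_n)_{j,k}(f) = \tilde\gamma_{(n)}(k-j), \quad j,k = 1,\dots,n. \tag{2.1}$$

We shall also write $\tilde\Gamma_n(f)$ for the corresponding $n\times n$ matrix, or simply $\tilde\Gamma_n$ and $\tilde\gamma_{(n)}(h)$ if the dependence on $f$ is understood. Define

$$\omega_j = \frac{2\pi j}{n}, \quad |j|\le(n-1)/2. \tag{2.2}$$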

It is well known (see Brockwell and Davis (1991), relation 4.5.5) that the spectral decomposition of $\tilde\Gamma_n$ can be described as follows. We have

$$\tilde\Gamma_n = \sum_{|j|\le(n-1)/2}\lambda_j u_j u_j' \tag{2.3}$$

where $\lambda_j$ are real eigenvalues and $u_j$ are real orthonormal eigenvectors. The eigenvalues are

$$\lambda_j = \sum_{|k|\le(n-1)/2}\gamma(k)\exp(-i\omega_j k), \quad |j|\le(n-1)/2.$$

Note that $\lambda_j = \lambda_{-j}$, $j\ne 0$, and that the $\lambda_j$ are approximate values of $2\pi f$ in the points $\omega_j$. Indeed define

$$f_n(\omega) = \frac{1}{2\pi}\sum_{|k|\le(n-1)/2}\gamma(k)\exp(ik\omega), \quad \omega\in[-\pi,\pi], \tag{2.4}$$

a truncated Fourier series approximation to $f$; then $f_n$ is an even function on $[-\pi,\pi]$ and

$$\lambda_j = 2\pi f_n(\omega_j), \quad |j|\le(n-1)/2. \tag{2.5}$$

The eigenvectors are

$$u_0' = n^{-1/2}(1,\dots,1), \tag{2.6}$$

$$u_j' = (2/n)^{1/2}\left(1,\cos(\omega_j),\cos(2\omega_j),\dots,\cos((n-1)\omega_j)\right), \tag{2.7}$$

$$u_{-j}' = (2/n)^{1/2}\left(0,\sin(\omega_j),\sin(2\omega_j),\dots,\sin((n-1)\omega_j)\right), \quad j = 1,\dots,(n-1)/2. \tag{2.8}$$
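
The spectral decomposition (2.3) with eigenvalues $\lambda_j = 2\pi f_n(\omega_j)$ and eigenvectors (2.6) to (2.8) can be checked numerically. The following Python sketch does so for an AR(1) autocovariance; the choice of autocovariance, the values `rho = 0.6` and `n = 21`, and the function names are illustrative assumptions.

```python
import numpy as np

rho = 0.6
def gamma(h):
    """Autocovariance of an AR(1) process with unit innovation variance (assumption)."""
    return rho ** np.abs(h) / (1.0 - rho ** 2)

def circulant_cov(gamma, n):
    """Circulant matrix with entries gamma_tilde(k - j) as in (2.1), n odd."""
    h = np.abs(np.arange(n)[None, :] - np.arange(n)[:, None])
    h = np.where(h <= (n - 1) // 2, h, n - h)   # wrap lags beyond (n-1)/2
    return gamma(h)

def eigen_system(gamma, n):
    """Eigenvalues lambda_j and eigenvectors u_j, ordered j = -(n-1)/2, ..., (n-1)/2."""
    half = (n - 1) // 2
    js = np.arange(-half, half + 1)
    omega = 2.0 * np.pi * js / n
    k = np.arange(-half, half + 1)
    lam = np.real(np.exp(-1j * np.outer(omega, k)) @ gamma(k))
    t = np.arange(n)
    U = np.empty((n, n))
    for col, j in enumerate(js):
        if j == 0:
            U[:, col] = n ** -0.5
        elif j > 0:
            U[:, col] = (2.0 / n) ** 0.5 * np.cos(t * omega[col])
        else:
            # u_{-j} uses sin(k * omega_j) with the positive frequency omega_j
            U[:, col] = (2.0 / n) ** 0.5 * np.sin(t * (-omega[col]))
    return lam, U

n = 21
C = circulant_cov(gamma, n)
lam, U = eigen_system(gamma, n)
reconstruction = U @ np.diag(lam) @ U.T
```

Here `reconstruction` should agree with `C` up to rounding, and the columns of `U` should be orthonormal.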

In our setting, the circulant matrix $\tilde\Gamma_n$ is positive definite for $n$ large enough. Indeed, Lemma 5.6 (Appendix) implies that $f_n\ge M^{-1}/2$ uniformly over $f\in\Sigma$, for $n$ large enough, so that $\tilde\Gamma_n(f)$ is a covariance matrix. Define the experiment, in analogy to (1.3),

$$\tilde{\mathcal{E}}_n = \left(N_n(0,\tilde\Gamma_n(f)),\ f\in\Sigma\right) \tag{2.9}$$

with data $\tilde y^{(n)}$, say. The sequence $\tilde y^{(n)}$ may be called a "periodic process" since it can be represented in terms of independent standard Gaussians $\xi_j$ as a finite sum

$$\tilde y^{(n)} = \sum_{|j|\le(n-1)/2}\lambda_j^{1/2}u_j\xi_j \tag{2.10}$$

where the vector $u_j$ describes a deterministic oscillation (cp. (2.6) - (2.8)). Accordingly $\tilde{\mathcal{E}}_n$ will be called a periodic Gaussian experiment.

The periodic process $\tilde y^{(n)}$ is known to approximate the original time series $y^{(n)}$ in the following sense. Define the $n\times n$-matrix

$$U_n = \left(u_{-(n-1)/2},\dots,u_{(n-1)/2}\right) \tag{2.11}$$

and consider the transforms

$$z^{(n)} = (2\pi)^{-1/2}U_n'y^{(n)}, \qquad \tilde z^{(n)} = (2\pi)^{-1/2}U_n'\tilde y^{(n)}.$$

Denote by $\mathrm{Cov}(z^{(n)})$ the covariance matrix of the random vector $z^{(n)}$. Then we have (Brockwell and Davis (1991), Proposition 4.5.2), for given $f\in\Sigma$,

$$\sup_{i,j}\left|\mathrm{Cov}(z^{(n)})_{i,j}-\mathrm{Cov}(\tilde z^{(n)})_{i,j}\right|\to 0 \text{ as } n\to\infty. \tag{2.12}$$

Since $\mathrm{Cov}(\tilde z^{(n)})$ is diagonal with diagonal elements $\lambda_j/2\pi$, this means that the elements of $z^{(n)}$ are approximately uncorrelated for large $n$.

Note that $\tilde z^{(n)}$ can also be written, in accordance with (2.10) and (2.5),

$$\tilde z^{(n)} = \left(f_n^{1/2}(\omega_j)\,\xi_j\right)_{|j|\le(n-1)/2} \tag{2.13}$$

which is nearly identical with the Gaussian scale model (1.5). Thus the question appears whether the approximation (2.12) can be strengthened to a total variation approximation of the respective laws $\mathcal{L}(z^{(n)}|f)$ and $\mathcal{L}(\tilde z^{(n)}|f)$.

The answer to that is negative; let us introduce some notation. For $n\times n$ matrices $A = (a_{jk})$ define the Euclidean norm $\|A\|$ by

$$\|A\|^2 := \mathrm{tr}\left[A'A\right] = \sum_{j=1}^{n}\sum_{k=1}^{n}a_{jk}^2.$$

If $A$ is symmetric, we denote the largest and smallest eigenvalues by $\lambda_{\max}(A)$, $\lambda_{\min}(A)$. For later use, we also define the operator norm of (not necessarily symmetric) $A$ by

$$|A| := \left(\lambda_{\max}(A'A)\right)^{1/2}.$$

If $A$ is symmetric nonnegative definite then $|A| = \lambda_{\max}(A)$. The following lemma shows that the Hellinger distance between the laws of $y^{(n)}$ and $\tilde y^{(n)}$ depends crucially on the total Euclidean distance $\|\Gamma_n(f)-\tilde\Gamma_n(f)\|$ between the covariance matrices, so that an elementwise convergence as in (2.12) is not enough.

Lemma 2.1 Let $A$, $B$ be $n\times n$ covariance matrices and suppose that for some $M > 1$

$$0 < M^{-1}\le\lambda_{\min}(A) \text{ and } \lambda_{\max}(A)\le M.$$

Then there exist $\varepsilon = \varepsilon_M > 0$ and $K = K_M > 1$ not depending on $A$, $B$ and $n$ such that $\|A-B\|\le\varepsilon$ implies

$$K^{-1}\|A-B\|^2\le H^2\left(N_n(0,A),N_n(0,B)\right)\le K\|A-B\|^2,$$

where $H(\cdot,\cdot)$ is the Hellinger distance.
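
The two-sided bound of Lemma 2.1 can be explored numerically with the closed form of the Hellinger affinity between centered Gaussians. The Python sketch below assumes the normalization $H^2(P,Q) = \int(\sqrt{p}-\sqrt{q})^2$; the paper's normalization is not visible in this section and may differ by a constant factor. The test matrices and function names are illustrative assumptions.

```python
import numpy as np

def hellinger_sq(A, B):
    """Squared Hellinger distance between N(0, A) and N(0, B), assuming the convention
    H^2 = int (sqrt(p) - sqrt(q))^2 = 2 * (1 - affinity), with
    affinity = det(A)^{1/4} det(B)^{1/4} / det((A + B)/2)^{1/2}."""
    _, log_det_a = np.linalg.slogdet(A)
    _, log_det_b = np.linalg.slogdet(B)
    _, log_det_mid = np.linalg.slogdet((A + B) / 2.0)
    log_affinity = 0.25 * log_det_a + 0.25 * log_det_b - 0.5 * log_det_mid
    return -2.0 * np.expm1(log_affinity)   # 2 * (1 - exp(log_affinity)), computed stably

def euclidean_sq(A):
    """Squared Euclidean (Frobenius) norm ||A||^2 = tr[A'A]."""
    return float(np.sum(A ** 2))

# Small perturbations of the identity: the ratio H^2 / ||A - B||^2 stays bounded
# away from 0 and infinity, in line with the two-sided bound of the lemma.
n = 50
A = np.eye(n)
ratios = []
for t in (0.02, 0.01, 0.005):
    B = (1.0 + t) * np.eye(n)
    ratios.append(hellinger_sq(A, B) / euclidean_sq(A - B))
```

For this particular family a second-order expansion gives a ratio close to $1/8$ under the stated normalization.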

The proof is in section 5. To apply this lemma, set $A = \Gamma_n(f)$, $B = \tilde\Gamma_n(f)$ and note that, since $f\in\Sigma$ is bounded and bounded away from 0 (both uniformly over $f\in\Sigma$), the condition on the eigenvalues of $\Gamma_n(f)$ is fulfilled, also uniformly over $f\in\Sigma$ (Brockwell and Davis (1991), Proposition 4.5.3). We shall see that the expression $\|\Gamma_n(f)-\tilde\Gamma_n(f)\|^2$ is closely related to a Sobolev type seminorm for smoothness index 1/2. For any $f\in L_2(-\pi,\pi)$ given by (1.1) set

$$|f|_{2,\alpha}^2 := \sum_{k=-\infty}^{\infty}|k|^{2\alpha}\gamma_f^2(k), \qquad \|f\|_{2,\alpha}^2 := \gamma_f^2(0)+|f|_{2,\alpha}^2 \tag{2.14}$$

provided the right side is finite; the Sobolev ball $W^\alpha(M)$ given by (1.7) is then described by $\|f\|_{2,\alpha}^2\le M$. Also, for any natural $m$ define a finite dimensional linear subspace of $L_2(-\pi,\pi)$

$$L_m = \left\{f\in L_2(-\pi,\pi) :\ \int_{-\pi}^{\pi}f(\omega)\exp(ik\omega)\,d\omega = 0,\ |k| > m\right\}.$$

Lemma 2.2 (i) For any $f\in\Sigma$ we have

$$\left\|\Gamma_n(f)-\tilde\Gamma_n(f)\right\|^2\le 2|f|_{2,1/2}^2 \tag{2.15}$$

and for $f\in\Sigma\cap L_{(n-1)/2}$

$$|f|_{2,1/2}^2\le\left\|\Gamma_n(f)-\tilde\Gamma_n(f)\right\|^2.$$

(ii) For any $f, f_0\in\Sigma$ we have

$$\left\|\Gamma_n(f)-\Gamma_n(f_0)-\left(\tilde\Gamma_n(f)-\tilde\Gamma_n(f_0)\right)\right\|^2\le 2|f-f_0|_{2,1/2}^2. \tag{2.16}$$

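Proof. (i) From the definition of $\Gamma_n(f)$ and $\tilde\Gamma_n(f)$ in terms of $\gamma(\cdot)$, $\tilde\gamma_{(n)}(\cdot)$ we immediately obtain

$$\left\|\Gamma_n(f)-\tilde\Gamma_n(f)\right\|^2 = \sum_{|k|\le n-1}(n-|k|)\left(\gamma(k)-\tilde\gamma_{(n)}(k)\right)^2$$

$$= \sum_{|k|=(n+1)/2}^{n-1}(n-|k|)\left(\gamma(k)-\gamma(n-|k|)\right)^2 = 2\sum_{k=1}^{(n-1)/2}k\left(\gamma(k)-\gamma(n-k)\right)^2 \tag{2.17}$$

$$\le 2\sum_{k=1}^{(n-1)/2}2k\left(\gamma^2(k)+\gamma^2(n-k)\right)\le 4\sum_{k=1}^{n-1}k\gamma^2(k)\le 2|f|_{2,1/2}^2.$$

The first inequality is proved. The second one follows immediately from (2.17), since $\gamma(n-k) = 0$ for $1\le k\le(n-1)/2$ when $f\in L_{(n-1)/2}$.

(ii) Note that for any $n$, the mapping $f\mapsto\Gamma_n(f)$, if it is defined by (1.2) for any $f\in L_2(-\pi,\pi)$, is linear, and the same is true for $f\mapsto\tilde\Gamma_n(f)$ defined by (2.1). Hence

$$\Gamma_n(f)-\Gamma_n(f_0) = \Gamma_n(f-f_0), \qquad \tilde\Gamma_n(f)-\tilde\Gamma_n(f_0) = \tilde\Gamma_n(f-f_0).$$

Now the argument is completely analogous to (i) if $\gamma(k) = \gamma_f(k)$ is replaced by $\gamma_{f-f_0}(k)$. ∎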
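
The identity (2.17) and the bound (2.15) lend themselves to a direct numerical check. The Python sketch below builds $\Gamma_n$ and $\tilde\Gamma_n$ for an AR(1) autocovariance and compares $\|\Gamma_n-\tilde\Gamma_n\|^2$ with the right-hand side of (2.17) and with $2|f|^2_{2,1/2}$. The autocovariance, the values of `rho` and `n`, the truncation of the infinite series at lag 400 and the function names are assumptions made for illustration.

```python
import numpy as np

rho = 0.6
def gamma(h):
    """Autocovariance of an AR(1) process with unit innovation variance (assumption)."""
    return rho ** np.abs(h) / (1.0 - rho ** 2)

def lag_matrix(n):
    return np.abs(np.arange(n)[None, :] - np.arange(n)[:, None])

def toeplitz_cov(n):
    """Gamma_n with entries gamma(k - j)."""
    return gamma(lag_matrix(n))

def circulant_cov(n):
    """Gamma_tilde_n with entries gamma_tilde(k - j) as in (2.1), n odd."""
    h = lag_matrix(n)
    return gamma(np.where(h <= (n - 1) // 2, h, n - h))

n = 41
lhs = np.sum((toeplitz_cov(n) - circulant_cov(n)) ** 2)   # ||Gamma_n - Gamma_tilde_n||^2

k = np.arange(1, (n - 1) // 2 + 1)
rhs = 2.0 * np.sum(k * (gamma(k) - gamma(n - k)) ** 2)    # right side of (2.17)

kk = np.arange(1, 400)
seminorm_sq = 2.0 * np.sum(kk * gamma(kk) ** 2)           # |f|^2_{2,1/2}, truncated series
upper_bound = 2.0 * seminorm_sq                           # right side of (2.15)
```

In this example `lhs` matches `rhs` up to rounding and stays below `upper_bound`, but it does not become small as $n$ grows, which is the obstruction discussed next.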

Our assumption $f\in\Sigma$, i.e. $\|f\|_{2,\alpha}^2\le M$ for some $\alpha > 1/2$, provides an upper bound $M$ for $|f|_{2,1/2}^2$ but does not guarantee that this term is uniformly small. Thus we are not able to utilize Lemma 2.1 to approximate $\mathcal{E}_n$ by $\tilde{\mathcal{E}}_n$ in Hellinger distance. In fact this Hellinger distance approximation does not take place: take a fixed $m$, select $f\in\Sigma\cap L_m$ such that

$$|f|_{2,1/2} = \varepsilon$$

with $\varepsilon$ from Lemma 2.1 and use the lower bound in this lemma to show that

$$H^2\left(N_n(0,\Gamma_n(f)),N_n(0,\tilde\Gamma_n(f))\right)\ge K^{-1}\varepsilon^2 \tag{2.18}$$

for all sufficiently large $n$. Thus the direct approximation of the time series data $y^{(n)}$ by the periodic process $\tilde y^{(n)}$ in total variation distance fails.

However that does not contradict asymptotic equivalence since the latter allows for a randomization mapping (Markov kernel) applied to $y^{(n)}$ and $\tilde y^{(n)}$, respectively, before total variation distance of the laws is taken. We will show the existence of appropriate Markov kernels in an indirect way, via a bracketing of the original time series experiment by upper and lower bounds in the sense of informativity.

Let now $\mathcal{E}_n$ again be the time series experiment (1.3): we shall find an asymptotic bracketing, i.e. two sequences $\mathcal{E}_{l,n}$, $\mathcal{E}_{u,n}$ such that

$$\mathcal{E}_{l,n}\precsim\mathcal{E}_n\precsim\mathcal{E}_{u,n}$$

and such that both $\mathcal{E}_{l,n}$ and $\mathcal{E}_{u,n}$ are asymptotically equivalent to $\tilde{\mathcal{E}}_n$ given by (2.9), and to $\dot{\mathcal{E}}_n$ representing the independent Gaussians $z_1,\dots,z_n$ in Theorem 1.1.

3 Upper informativity bracket 

The spectral representation (2.10) of the periodic sequence $\tilde y^{(n)} = (\tilde y(1),\dots,\tilde y(n))'$ can be written

$$\tilde y(t) = (2\pi/n)^{1/2}f_n^{1/2}(0)\,\xi_0+2(\pi/n)^{1/2}\sum_{j=1}^{(n-1)/2}f_n^{1/2}(\omega_j)\cos((t-1)\omega_j)\,\xi_j$$

$$\qquad+\,2(\pi/n)^{1/2}\sum_{j=-(n-1)/2}^{-1}f_n^{1/2}(\omega_j)\sin((t-1)\omega_j)\,\xi_j, \quad t = 1,\dots,n. \tag{3.1}$$

We saw that here $\tilde y^{(n)}$ is a one-to-one function $\tilde y^{(n)} = (2\pi)^{1/2}U_n\tilde z^{(n)}$ of the $n$-vector of independent Gaussians $\tilde z^{(n)}$ (cf. (2.13)), but the approximation of $\tilde y^{(n)}$ to $y^{(n)}$ is not in the total variation sense (cf. (2.18)). Now take a limit in (3.1) for $n\to\infty$ and fixed $t$ and observe that (heuristically) this yields the spectral representation of the original stationary sequence $y(t)$

$$y(t+1) = \int_{[0,\pi]}\sqrt{2}\,f^{1/2}(\omega)\cos(t\omega)\,dB_\omega+\int_{[-\pi,0]}\sqrt{2}\,f^{1/2}(\omega)\sin(t\omega)\,dB_\omega, \quad t = 0,1,\dots \tag{3.2}$$

where $dB_\omega$ is standard Gaussian white noise on $[-\pi,\pi]$ (cf. Brockwell and Davis (1991), Probl. 4.31). Here for any $n$, the vector $y^{(n)} = (y(1),\dots,y(n))'$ is represented as a functional of the continuous time process

$$dZ_\omega^* = f^{1/2}(\omega)\,dB_\omega, \quad \omega\in[-\pi,\pi].$$

Thus a completely observed process $Z_\omega^*$, $\omega\in[-\pi,\pi]$, would represent an upper informativity bracket for any sample size $n$, but this experiment is statistically trivial since the observation here identifies the parameter $f$.

Our approach now is to construct an intermediate series $\tilde y^{(n,m)}$ of size $n$ in which the uniform size $n$ grid of points $\omega_j$, $|j|\le(n-1)/2$, is replaced by a finer uniform grid of $m > n$ points in the representation (3.1). Thus $\tilde y^{(n,m)}$ is a functional not of $n$ independent Gaussians but of $m > n$ of these; call their vector $\tilde z^{(m)}$. The random vector $\tilde z^{(m)}$ now represents an upper informativity bracket which remains nontrivial (asymptotically) if $m-n\to\infty$ not too quickly. An equivalent description of that idea is as follows. Consider $m > n$ and the periodic process $\tilde y^{(m)}$ given by (2.10) where the original sample size $n$ is replaced by $m$. Then define $\tilde y^{(n,m)}$ as the vector of the first $n$ components of $\tilde y^{(m)}$. The law of $\tilde y^{(n,m)}$ is $N_n(0,\tilde\Gamma_{n,m}(f))$ where $\tilde\Gamma_{n,m}(f)$ is the upper left $n\times n$ submatrix of $\tilde\Gamma_m(f)$.

We now easily observe the improved approximation quality of $\tilde y^{(n,m)}$ for $y^{(n)}$. Assume that $m$ is also uneven. First note that for $(m+1)/2\ge n$ we already obtain $\tilde\Gamma_{n,m}(f) = \Gamma_n(f)$. This follows immediately from the definition of the circulant matrix $\tilde\Gamma_m(f)$ via the autocovariance function $\tilde\gamma_{(m)}(\cdot)$. However we would like to limit the increase of sample size, i.e. require $m/n\to 1$; therefore, in what follows we assume $m < 2n-1$.



Lemma 3.1 Assume $m$ is uneven, $n < m < 2n-1$. Then for any $f\in\Sigma$ we have

$$\left\|\Gamma_n(f)-\tilde\Gamma_{n,m}(f)\right\|^2\le 4(m-n+1)^{1-2\alpha}|f|_{2,\alpha}^2,$$

and hence if $m = m_n$ is such that $m-n\to\infty$ as $n\to\infty$ then

$$\sup_{f\in\Sigma}H^2\left(N_n(0,\Gamma_n(f)),N_n(0,\tilde\Gamma_{n,m}(f))\right)\to 0. \tag{3.3}$$

Proof. From the definition of $\Gamma_n(f)$ and $\tilde\Gamma_{n,m}(f)$ we immediately obtain

$$\left\|\Gamma_n(f)-\tilde\Gamma_{n,m}(f)\right\|^2 = \sum_{|k|\le n-1}(n-|k|)\left(\gamma(k)-\tilde\gamma_{(m)}(k)\right)^2$$

$$= 2\sum_{k=(m+1)/2}^{n-1}(n-k)\left(\gamma(k)-\gamma(m-k)\right)^2\le 4\sum_{k=(m+1)/2}^{n-1}(n-k)\left(\gamma^2(k)+\gamma^2(m-k)\right).$$

Now note that for $m > n$, the relation $(m+1)/2\le k\le n-1$ implies $k > (n+1)/2$ and therefore $n-k < k$, and note also $n-k < m-k$. We obtain an upper bound

$$\le 4\sum_{k=(m+1)/2}^{n-1}k\gamma^2(k)+4\sum_{k=(m+1)/2}^{n-1}(m-k)\gamma^2(m-k)$$

$$= 4\sum_{k=(m+1)/2}^{n-1}k\gamma^2(k)+4\sum_{k=m-n+1}^{(m-1)/2}k\gamma^2(k) = 4\sum_{k=m-n+1}^{n-1}k\gamma^2(k)$$

$$\le 4(m-n+1)^{1-2\alpha}\sum_{k=m-n+1}^{n-1}k^{2\alpha}\gamma^2(k)\le 4(m-n+1)^{1-2\alpha}|f|_{2,\alpha}^2$$

where $\alpha > 1/2$. This proves the first relation. For the second, recall that $|f|_{2,\alpha}^2\le M$ for $f\in\Sigma$ and invoke Lemma 2.1 together with the subsequent remark on the eigenvalues of $\Gamma_n(f)$. ∎
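
The effect of embedding the sample into a longer periodic process can be checked numerically. The Python sketch below compares $\|\Gamma_n(f)-\tilde\Gamma_{n,m}(f)\|^2$ with the bound $4(m-n+1)^{1-2\alpha}|f|^2_{2,\alpha}$ of Lemma 3.1 for several $m$, and also covers the boundary case $m = 2n-1$ where the two matrices coincide. The AR(1) autocovariance, the choice $\alpha = 1$, the truncation of the series at lag 400 and all names are illustrative assumptions.

```python
import numpy as np

rho, alpha, n = 0.6, 1.0, 41

def gamma(h):
    """Autocovariance of an AR(1) process with unit innovation variance (assumption)."""
    return rho ** np.abs(h) / (1.0 - rho ** 2)

def lag_matrix(size):
    return np.abs(np.arange(size)[None, :] - np.arange(size)[:, None])

def toeplitz_cov(size):
    return gamma(lag_matrix(size))

def circulant_cov(size):
    h = lag_matrix(size)
    return gamma(np.where(h <= (size - 1) // 2, h, size - h))

k = np.arange(1, 400)
seminorm_sq = 2.0 * np.sum(k ** (2 * alpha) * gamma(k) ** 2)   # |f|^2_{2,alpha}, truncated

results = {}
for m in (43, 51, 61, 2 * n - 1):
    sub = circulant_cov(m)[:n, :n]                # upper left n x n block of Gamma_tilde_m
    dist_sq = np.sum((toeplitz_cov(n) - sub) ** 2)
    bound = 4.0 * (m - n + 1) ** (1.0 - 2.0 * alpha) * seminorm_sq
    results[m] = (dist_sq, bound)
```

Each entry of `results` holds the squared distance and the corresponding bound; the distance shrinks as $m-n$ grows.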

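Define the experiment

$$\tilde{\mathcal{E}}_{n,m} = \left(N_n(0,\tilde\Gamma_{n,m}(f)),\ f\in\Sigma\right);$$

then (3.3) implies $\mathcal{E}_n\simeq\tilde{\mathcal{E}}_{n,m}$ if $m-n\to\infty$. Moreover, we have $\tilde{\mathcal{E}}_{n,m}\preceq\tilde{\mathcal{E}}_m$ by definition, thus

$$\mathcal{E}_n\precsim\tilde{\mathcal{E}}_m$$

in case $m-n\to\infty$. We know that $\tilde{\mathcal{E}}_m$ is equivalent (via the linear transformation $(2\pi)^{-1/2}U_m'$) to observing data $\tilde z^{(m)}$ given by (2.13) with $n$ replaced by $m$. Define $\dot{\mathcal{E}}_n$ by

$$\dot{\mathcal{E}}_n = \left(N_n(0,\dot\Gamma_n(f)),\ f\in\Sigma\right) \tag{3.4}$$

where

$$\dot\Gamma_n(f) = \mathrm{Diag}\left(J_{j,n}(f)\right)_{j=1,\dots,n}.$$

Note that the data $z_1,\dots,z_n$ in Theorem 1.1 are represented by $\dot{\mathcal{E}}_n$. We shall also write $\dot z^{(n)}$ for their vector, so that $\mathcal{L}(\dot z^{(n)}|f) = N_n(0,\dot\Gamma_n(f))$.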

Proposition 3.2 We have $\tilde{\mathcal{E}}_n\simeq\dot{\mathcal{E}}_n$, with corresponding equivalence maps (Markov kernels) as follows. Let $\tilde y^{(n)}$ and $\dot z^{(n)}$ be data in $\tilde{\mathcal{E}}_n$ and $\dot{\mathcal{E}}_n$ respectively. Then, for the orthogonal matrix $U_n$ given by (2.11),

$$(2\pi)^{-1/2}U_n'\tilde y^{(n)}\simeq\dot z^{(n)}, \quad\text{and}\quad (2\pi)^{1/2}U_n\dot z^{(n)}\simeq\tilde y^{(n)}.$$

Proof. Note that our first claim can also be written $\tilde z^{(n)}\simeq\dot z^{(n)}$ where $\tilde z^{(n)}$ is from (2.13). To describe $\mathcal{L}(\tilde z^{(n)}|f)$, define $\delta_j = f_n(\omega_{j-(n+1)/2})$ for $j = 1,\dots,n$ and an $n\times n$ covariance matrix

$$\Lambda_n(f) = \mathrm{Diag}(\delta_j)_{j=1,\dots,n}.$$

Then $\mathcal{L}(\tilde z^{(n)}|f) = N_n(0,\Lambda_n(f))$. The conditions on $f$ (see also the corresponding lemma in the Appendix) imply that uniformly over $j = 1,\dots,n$

$$J_{j,n}(f)\ge C^{-1}, \qquad J_{j,n}(f)\le C$$

for some $C > 0$ not depending on $f$ and $n$. Now apply Lemma 2.1 to obtain

$$H^2\left(N_n(0,\dot\Gamma_n(f)),N_n(0,\Lambda_n(f))\right)\le C\left\|\dot\Gamma_n(f)-\Lambda_n(f)\right\|^2 = C\sum_{j=1}^{n}\left(J_{j,n}(f)-\delta_j\right)^2.$$

By Lemma 5.7 this is $o(1)$ uniformly in $f$. This implies the first relation $\simeq$. The second relation is an obvious consequence. ∎

For a choice $m = n+r_n$, $r_n = 2[\log(n/2)]$, we immediately obtain the following result. Define the upper bracket Gaussian scale experiment $\mathcal{E}_{u,n}$ by

$$\mathcal{E}_{u,n} := \dot{\mathcal{E}}_{n+r_n}. \tag{3.5}$$

Corollary 3.3 Consider experiments $\mathcal{E}_n$ and $\mathcal{E}_{u,n}$ given respectively by (1.3) and (3.5), (3.4) with parameter space $\Sigma = \Sigma_{\alpha,M}$ where $M > 0$, $\alpha > 1/2$. Then as $n\to\infty$

$$\mathcal{E}_n\precsim\mathcal{E}_{u,n}.$$

4 Lower informativity bracket



The upper bound (2.15) for the Hellinger distance of $y^{(n)}$ and the periodic process $\tilde y^{(n)}$, which does not tend to 0, can be improved in a certain sense if $f$ is restricted to a shrinking neighborhood, $\Sigma_n(f_0)$ say, of some $f_0\in\Sigma$. At this stage, $f_0$ is assumed known so the covariance matrices $\Gamma_n(f_0)$ and $\tilde\Gamma_n(f_0)$ can be used for a linear transformation of $y^{(n)}$ which brings it closer to the periodic process $\tilde y^{(n)}$. The linear transformation of $y^{(n)}$ which depends on $f_0$ can be construed as a Markov kernel mapping which yields asymptotic equivalence $\mathcal{E}_n(f_0)\approx\tilde{\mathcal{E}}_n(f_0)$ if these are the versions of $\mathcal{E}_n$ and $\tilde{\mathcal{E}}_n$ with $f$ restricted to $f\in\Sigma_n(f_0)$.

Such a local asymptotic equivalence can be globalized in a standard way (cf. Nussbaum (1996), Grama and Nussbaum (1998)) if sample splitting were available in both global experiments $\mathcal{E}_n$ and $\tilde{\mathcal{E}}_n$. For the original stationary process that would mean that observing a series of size $n$ is equivalent to observing two independent series of size approximately $n/2$. We will establish an asymptotic version of sample splitting for $y^{(n)}$ which involves omitting a fraction of the sample in the center of the series, i.e. omitting terms with index near $n/2$. The ensuing loss of information means that the globalization procedure only yields a lower asymptotic informativity bracket for $\mathcal{E}_n$, i.e. a sequence $\mathcal{E}^{\#}_{l,n}$ such that $\mathcal{E}^{\#}_{l,n}\precsim\mathcal{E}_n$. The experiment $\mathcal{E}^{\#}_{l,n}$ will be made up of two independent periodic processes with the same parameter $f$ and with a sample size $m\approx(n-\log n)/2$. Each of these is equivalent to a Gaussian scale model (2.13) with $n$ replaced by $m$; further arguments show that observing these two is asymptotically equivalent to a Gaussian scale model $\mathcal{E}_{l,n} := \dot{\mathcal{E}}_{2m}$ with grid size $2m\approx n-\log n$.

A crucial step now consists in showing that in the Gaussian scale models $\dot{\mathcal{E}}_n$, the grid size $n$ can be replaced by $n-\log n$ or $n+\log n$. This step is an analog, for the special regression model, of the well known reasoning in the i.i.d. case that additional observations may be asymptotically negligible (cf. Mammen (1986) for parametric i.i.d. models, Low and Zhou (2004) for the nonparametric case). Thus it follows that the lower and upper bracketing experiments $\mathcal{E}_{l,n}$, $\mathcal{E}_{u,n}$ are both asymptotically equivalent to $\dot{\mathcal{E}}_n$, and the relations

$$\mathcal{E}_{l,n}\precsim\mathcal{E}_n\precsim\mathcal{E}_{u,n}$$

then imply $\mathcal{E}_n\approx\dot{\mathcal{E}}_n$, i.e. Theorem 1.1.
4.1 Local experiments 

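Let $\kappa_n$ be a sequence with $\kappa_n \searrow 0$, fixed in the sequel. A specific choice of $\kappa_n$ will be made in Section 4.4 below (see (4.12)). Let $\|\cdot\|_\infty$ be the sup-norm for real functions defined on $[-\pi,\pi]$,

$$\|f\|_\infty = \sup_{\omega \in [-\pi,\pi]} |f(\omega)|,$$

and for $f_0 \in \Sigma$ define shrinking neighborhoods

$$\Sigma_n(f_0) = \left\{ f \in \Sigma : \|f - f_0\|_\infty + \|f - f_0\|_{2,1/2} \le \kappa_n \right\}. \tag{4.1}$$

The restricted experiments are

$$\mathcal{E}_n(f_0) = \left(N_n(0,\Gamma_n(f)),\ f \in \Sigma_n(f_0)\right), \qquad \tilde{\mathcal{E}}_n(f_0) = \left(N_n(0,\tilde{\Gamma}_n(f)),\ f \in \Sigma_n(f_0)\right).$$

For shortness write $\Gamma = \Gamma_n(f)$, $\Gamma_0 = \Gamma_n(f_0)$ and similarly $\tilde{\Gamma} = \tilde{\Gamma}_n(f)$, $\tilde{\Gamma}_0 = \tilde{\Gamma}_n(f_0)$. Define a matrix

$$K_n = K_n(f_0) = \tilde{\Gamma}_0^{1/2}\,\Gamma_0^{-1/2} \tag{4.2}$$

and in experiment $\mathcal{E}_n(f_0)$ consider the transformed observations $K_n(f_0)\,y^{(n)}$.

Consider also the experiment $\mathcal{E}_n^*(f_0)$ given by the laws of these transformed observations, i.e.

$$\mathcal{E}_n^*(f_0) = \left(N_n\!\left(0, K_n(f_0)\Gamma_n(f)K_n'(f_0)\right),\ f \in \Sigma_n(f_0)\right).$$

Clearly $\mathcal{E}_n(f_0) \sim \mathcal{E}_n^*(f_0)$; the next result proves that $\mathcal{E}_n^*(f_0) \approx \tilde{\mathcal{E}}_n(f_0)$ and thus $\mathcal{E}_n(f_0) \approx \tilde{\mathcal{E}}_n(f_0)$.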
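
As a purely numerical illustration of the form of (4.2), the following sketch builds $K = \tilde{\Gamma}_0^{1/2}\Gamma_0^{-1/2}$ for two generic positive definite matrices and checks that it maps the covariance $\Gamma_0$ onto $\tilde{\Gamma}_0$. The matrices are random stand-ins chosen for illustration (an assumption), not the Toeplitz and periodic covariance matrices of the text, and the helper names `sym_sqrt`, `transfer_matrix` and `random_spd` are hypothetical.

```python
import numpy as np

def sym_sqrt(a):
    """Symmetric square root of a symmetric positive definite matrix."""
    eigvals, eigvecs = np.linalg.eigh(a)
    return (eigvecs * np.sqrt(eigvals)) @ eigvecs.T

def transfer_matrix(gamma0, gamma0_tilde):
    """K = gamma0_tilde^{1/2} gamma0^{-1/2}, the form of the matrix in (4.2)."""
    return sym_sqrt(gamma0_tilde) @ np.linalg.inv(sym_sqrt(gamma0))

def random_spd(rng, n):
    """Random symmetric positive definite matrix (illustrative stand-in only)."""
    b = rng.standard_normal((n, n))
    return b @ b.T + n * np.eye(n)

rng = np.random.default_rng(0)
gamma0 = random_spd(rng, 5)
gamma0_tilde = random_spd(rng, 5)
k_mat = transfer_matrix(gamma0, gamma0_tilde)

# At f = f_0 the transformed covariance K Gamma_0 K' coincides with Gamma_0 tilde.
max_error = np.max(np.abs(k_mat @ gamma0 @ k_mat.T - gamma0_tilde))
print(max_error)
```

For $f \ne f_0$ the transformed covariance no longer matches exactly, which is why a bound such as Lemma 4.1 is needed.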

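Lemma 4.1 We have

$$\sup_{f_0 \in \Sigma}\ \sup_{f \in \Sigma_n(f_0)} H^2\!\left(N_n\!\left(0, K_n(f_0)\Gamma_n(f)K_n'(f_0)\right),\ N_n\!\left(0,\tilde{\Gamma}_n(f)\right)\right) \le C\,\kappa_n.$$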



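Proof. In view of Lemma 2.1 it suffices to show that

$$\sup_{f \in \Sigma}\left(\lambda_{\max}(\tilde{\Gamma}_n) + \lambda_{\min}^{-1}(\tilde{\Gamma}_n)\right) \le C \tag{4.3}$$

and that

$$\left\|K\Gamma K' - \tilde{\Gamma}\right\| \le C\,\kappa_n.$$

Note that

$$\lambda_{\max}(\tilde{\Gamma}) = \max_{|j| \le (n-1)/2} f_n(\omega_j), \qquad \lambda_{\min}(\tilde{\Gamma}) = \min_{|j| \le (n-1)/2} f_n(\omega_j)$$

and that Lemma 5.6 implies

$$\sup_{f \in \Sigma} \|f - f_n\|_\infty \to 0.$$

Hence (4.3) follows immediately from $f \in \Sigma$, more specifically the fact that values of $f$ are uniformly bounded and bounded away from 0. According to Proposition 4.5.3 in Brockwell and Davis (1991), the assumption $f \in \Sigma$ also implies a corresponding property for $\Gamma$, i.e.

$$\sup_{f \in \Sigma}\left(\lambda_{\max}(\Gamma_n) + \lambda_{\min}^{-1}(\Gamma_n)\right) \le C. \tag{4.4}$$

Note that eigenvalues of $\Gamma_0$ and $\tilde{\Gamma}_0$ share property (4.3) since $f_0 \in \Sigma$.
Set $G = \Gamma_0^{-1/2}\,\Gamma\,\Gamma_0^{-1/2}$ and $\tilde{G} = \tilde{\Gamma}_0^{-1/2}\,\tilde{\Gamma}\,\tilde{\Gamma}_0^{-1/2}$. Since

$$\left\|K\Gamma K' - \tilde{\Gamma}\right\| \le C\,\|G - \tilde{G}\|,$$

it now suffices to show that

$$\|G - \tilde{G}\| \le C\,\kappa_n. \tag{4.5}$$

To establish (4.5), denote $\Delta = \Gamma - \Gamma_0$, $\tilde{\Delta} = \tilde{\Gamma} - \tilde{\Gamma}_0$ and observe

$$\|G - \tilde{G}\| = \left\|\Gamma_0^{-1/2}\Delta\,\Gamma_0^{-1/2} - \tilde{\Gamma}_0^{-1/2}\tilde{\Delta}\,\tilde{\Gamma}_0^{-1/2}\right\| \le \left\|\Gamma_0^{-1/2}(\Delta - \tilde{\Delta})\,\Gamma_0^{-1/2}\right\| + \left\|\Gamma_0^{-1/2}\tilde{\Delta}\,\Gamma_0^{-1/2} - \tilde{\Gamma}_0^{-1/2}\tilde{\Delta}\,\tilde{\Gamma}_0^{-1/2}\right\|. \tag{4.6}$$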



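We shall now estimate the two terms on the right side separately. By elementary properties of eigenvalues we obtain

$$\left\|\Gamma_0^{-1/2}(\Delta - \tilde{\Delta})\,\Gamma_0^{-1/2}\right\| \le \left|\Gamma_0^{-1}\right|\,\|\Delta - \tilde{\Delta}\|$$

where $\left|\Gamma_0^{-1}\right| \le C$ and according to Lemma 2.2 (ii)

$$\|\Delta - \tilde{\Delta}\| \le 2\,|f - f_0|_{2,1/2}.$$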

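Furthermore

$$\left\|\Gamma_0^{-1/2}\tilde{\Delta}\,\Gamma_0^{-1/2} - \tilde{\Gamma}_0^{-1/2}\tilde{\Delta}\,\tilde{\Gamma}_0^{-1/2}\right\| \le \left\|\left(\Gamma_0^{-1/2} - \tilde{\Gamma}_0^{-1/2}\right)\tilde{\Delta}\,\Gamma_0^{-1/2}\right\| + \left\|\tilde{\Gamma}_0^{-1/2}\tilde{\Delta}\left(\Gamma_0^{-1/2} - \tilde{\Gamma}_0^{-1/2}\right)\right\|$$

$$\le 2C\,|\tilde{\Delta}|\,\left\|\Gamma_0^{-1/2} - \tilde{\Gamma}_0^{-1/2}\right\| = 2C\,|\tilde{\Delta}|\,\left\|\Gamma_0^{-1/2}\left(\tilde{\Gamma}_0^{1/2} - \Gamma_0^{1/2}\right)\tilde{\Gamma}_0^{-1/2}\right\| \le C\,|\tilde{\Delta}|\,\left\|\Gamma_0^{1/2} - \tilde{\Gamma}_0^{1/2}\right\|.$$

Applying Lemma 5.1 and Lemma 2.2 (i) we obtain

$$\left\|\Gamma_0^{1/2} - \tilde{\Gamma}_0^{1/2}\right\| \le C\,\|\Gamma_0 - \tilde{\Gamma}_0\| \le C\,|f_0|_{2,1/2}.$$

Here $|f_0|_{2,1/2} \le |f_0|_{2,\alpha}$, which is bounded uniformly over $f_0 \in \Sigma$. Collecting these estimates yields

$$\|G - \tilde{G}\| \le C\left(|f - f_0|_{2,1/2} + |\tilde{\Delta}|\right).$$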



To complete the proof, it suffices to note that, since V and To have the same set of eigenvectors 
(cf. ([23D and (jZSD - (|ZH| ) 



A 



A m ax(r-r ) = (2ir) max fn(wj) - fo,n(uj) 



< c 



fn — fi 



0,n 



j|<(n-l)/2 



< C 11/ - Mil + C n x ~ 2a logn ||/ - /oil 2 



where the last inequality is a consequence of Lemma [5. 61 Hence 
^5] 



< C>c„, which establishes 
4.2 Sample splitting



Consider sample splitting for a stationary process: take the observed $y^{(n)} = (y(1), \dots, y(n))$ and omit $r$ observations in the center of the series. Recall that $n$ was assumed uneven; assume now also $r$ to be uneven and set $m = (n - r)/2$, then the result is the series $y(1), \dots, y(m)$, $y(n-m+1), \dots, y(n)$. The total covariance matrix for these reduced data is

$$\Gamma^{(m)}_{n,0}(f) = \begin{pmatrix} \Gamma_m(f) & A_{n,m} \\ A_{n,m}' & \Gamma_m(f) \end{pmatrix}$$

where the $m \times m$ matrix $A_{n,m} = A_{n,m}(f)$ contains only covariances $\gamma_f(r+1)$, $\gamma_f(r+2)$ and of higher order. In fact $A_{n,m}$ is the upper right $m \times m$ submatrix of $\Gamma_n(f)$, i.e.

$$A_{n,m} = \begin{pmatrix} \cdots & \gamma(n-2) & \gamma(n-1) \\ \gamma(r+2) & \cdots & \gamma(n-2) \\ \gamma(r+1) & \gamma(r+2) & \cdots \end{pmatrix}.$$

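In the sequel we set $r_n = 2[(\log n)/2] + 1$ and thus $r_n \sim \log n$, $m = (n - r_n)/2$. The corresponding experiment we denote

$$\mathcal{E}^{\#}_{0,n} = \left(N_{2m}\!\left(0, \Gamma^{(m)}_{n,0}(f)\right),\ f \in \Sigma\right).$$

Clearly we have $\mathcal{E}^{\#}_{0,n} \preceq \mathcal{E}_n$.

Consider also the experiment where two independent stationary series of length $m$ are observed, $y_1^{(m)}$ and $y_2^{(m)}$, say. The corresponding experiment is

$$\mathcal{E}^{\#}_{1,n} := \left(N_{2m}\!\left(0, \Gamma^{(m)}_{n,1}(f)\right),\ f \in \Sigma\right) \tag{4.7}$$

where

$$\Gamma^{(m)}_{n,1}(f) = \begin{pmatrix} \Gamma_m(f) & 0_{m \times m} \\ 0_{m \times m} & \Gamma_m(f) \end{pmatrix}.$$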

Proposition 4.2 $\mathcal{E}^{\#}_{1,n} \approx \mathcal{E}^{\#}_{0,n}$.

Proof. Use Lemma 2.1 to compute the Hellinger distance. Take $A = \Gamma^{(m)}_{n,1}$; then the eigenvalues of $A$ are those of $\Gamma_m(f)$, so that (4.4) can be invoked. The squared distance of the covariance matrices $\Gamma^{(m)}_{n,0}$ and $\Gamma^{(m)}_{n,1}$ is

$$\left\|\Gamma^{(m)}_{n,0} - \Gamma^{(m)}_{n,1}\right\|^2 = 2\,\|A_{n,m}\|^2 \le 2\sum_{k=r+1}^{n-1}(k-r)\,\gamma^2(k) \le 2\sum_{k=r+1}^{n-1} k\,\gamma^2(k) \le (r+1)^{1-2\alpha}\,|f|^2_{2,\alpha}.$$

Since $r_n \to \infty$, the result follows. ■
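
The counting step behind the first inequality in this proof can be checked numerically: each lag $k$ between $r+1$ and $n-1$ occurs at most $k-r$ times in the block $A_{n,m}$. The sketch below does this for a geometric autocovariance $\gamma(h) = 0.6^{|h|}$, which is an illustrative assumption and not a choice made in the text; the function names are hypothetical.

```python
import numpy as np

def cross_block(acov, n, r):
    """Upper right m x m block of the n x n Toeplitz matrix, with m = (n - r) / 2.

    Entry (j, k) is acov(k - j) for rows j = 1..m and columns k = n-m+1..n.
    """
    m = (n - r) // 2
    rows = np.arange(m)[:, None]
    cols = np.arange(n - m, n)[None, :]
    return acov(cols - rows)

def lag_weighted_sum(acov, n, r):
    """sum_{k=r+1}^{n-1} (k - r) * acov(k)^2."""
    k = np.arange(r + 1, n)
    return np.sum((k - r) * acov(k) ** 2)

def geometric_acov(h, rho=0.6):
    """Illustrative autocovariance rho^|h| (an assumption for this sketch)."""
    return rho ** np.abs(h)

n, r = 101, 5                      # both uneven, as in the text
a_block = cross_block(geometric_acov, n, r)
lhs = 2.0 * np.sum(a_block ** 2)   # squared distance of the two covariance matrices
rhs = 2.0 * lag_weighted_sum(geometric_acov, n, r)
print(lhs, rhs)
```

The smallest lag in the block sits in its lower left corner and equals $r+1$, in line with the description of $A_{n,m}$ above.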

We have shown that two independent stationary sequences of length m = (n — r n ) /2 are 
asymptotically less informative than one sequence of length n. Having obtained a method of 
sample splitting for stationary sequences (with some loss of information), we can now use a 
localization argument to complete the proof of the lower bound. 
4.3 Preliminary estimators



For the globalization procedure, we need existence of an estimator $\hat{f}_n$, in both of the global experiments $\mathcal{E}_n$ and $\tilde{\mathcal{E}}_n$, such that $\hat{f}_n$ takes values in $\Sigma$ and

$$\|\hat{f}_n - f\|_\infty + \|\hat{f}_n - f\|_{2,1/2} = o_p(1)$$

uniformly over $f \in \Sigma$. More specifically, a rate $o_p(\kappa_n)$ with $\kappa_n$ from (4.1) is needed in the above result, but $\kappa_n$ has not been selected so far, and will be determined based on the results of this section (cf. (4.12) below). Select $\beta \in (1/2, \alpha)$ and consider the norm $\|f\|_{2,\beta}$ according to (2.14). Note that $\|f\|_{2,1/2} \le C\,\|f\|_{2,\beta}$ and that according to Lemma 5.6 $\|f\|_\infty \le C\,\|f\|_{2,\beta}$; therefore it suffices to show

$$\|\hat{f}_n - f\|_{2,\beta} = o_p(1). \tag{4.8}$$



For this, we shall use a standard truncated orthogonal series estimator and then modify it to take values in $\Sigma$. The empirical autocovariance function is

$$\hat{\gamma}_n(k) = \frac{1}{n-k}\sum_{j=1}^{n-k} y(j)\,y(k+j), \qquad k = 0, \dots, n-1.$$

We have unbiasedness: $E_f\,\hat{\gamma}_n(k) = \gamma_f(k)$; for the variance of $\hat{\gamma}_n(k)$ we have the following result.



Lemma 4.3 For any spectral density $f \in L_2(-\pi, \pi)$, and any $k = 0, \dots, n-1$

$$\mathrm{Var}_f\,\hat{\gamma}_n(k) \le \frac{5}{n-k}\sum_{j=0}^{n-1}\gamma_f^2(j).$$



Proof. For given $k$, set $m = n - k$ and $z(j) = y(j)\,y(j+k) - \gamma_f(k)$, $j = 1, \dots, m$. The $z(j)$ form a zero mean stationary series, with autocovariance function $\rho(j)$, say. We have

$$\mathrm{Var}\left(\frac{1}{m}\sum_{i=1}^{m} z(i)\right) = \frac{1}{m^2}\sum_{1 \le i,j \le m}\rho(i-j) = \frac{1}{m}\rho(0) + \frac{2}{m^2}\sum_{i=1}^{m-1}(m-i)\,\rho(i) \le \frac{2}{m}\sum_{i=0}^{m-1}|\rho(i)|. \tag{4.9}$$

The computation in Shiryaev (1996), (VI.4.5-6) gives

$$\rho(j) = \gamma^2(j) + \gamma(j-k)\,\gamma(j+k).$$

The inequality

$$2\,|\gamma(j-k)\,\gamma(j+k)| \le \gamma^2(j-k) + \gamma^2(j+k)$$

now implies

$$\sum_{j=0}^{m-1}|\rho(j)| \le \frac{5}{2}\sum_{j=0}^{n-1}\gamma^2(j)$$

(we bound the sum involving $\gamma^2(j-k)$ by $2\sum_{j=0}^{n-1}\gamma^2(j)$). In conjunction with (4.9) this proves the lemma. ■



For the orthogonal series estimator, define a truncation index $\tilde{n} = [n^{1/(2\alpha+1)}]$ and set

$$\hat{f}_n(\omega) = \sum_{|k| \le \tilde{n}} \hat{\gamma}_n(k)\exp(ik\omega), \qquad \omega \in [-\pi, \pi]. \tag{4.10}$$
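
The estimator (4.10) can be sketched in a few lines. The code below is a minimal illustration under stated assumptions: the series is treated as zero mean, $\hat{\gamma}_n(-k)$ is taken equal to $\hat{\gamma}_n(k)$ so that the sum reduces to cosines, and no normalising constant of the spectral density convention is applied. The function names are hypothetical.

```python
import numpy as np

def empirical_autocov(y, k):
    """Unbiased empirical autocovariance (1/(n-k)) * sum_j y(j) y(j+k), zero-mean series."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    return float(np.dot(y[: n - k], y[k:])) / (n - k)

def series_estimator(y, alpha, omegas):
    """Truncated series estimate in the form of (4.10).

    Uses the truncation index floor(n^{1/(2 alpha + 1)}) and assumes
    gamma_hat(-k) = gamma_hat(k), so the complex exponentials pair into cosines.
    Returns the estimate on the grid `omegas` and the truncation index.
    """
    n = len(y)
    n_trunc = int(np.floor(n ** (1.0 / (2.0 * alpha + 1.0))))
    omegas = np.asarray(omegas, dtype=float)
    estimate = np.full(omegas.shape, empirical_autocov(y, 0))
    for k in range(1, n_trunc + 1):
        estimate += 2.0 * empirical_autocov(y, k) * np.cos(k * omegas)
    return estimate, n_trunc

# Usage on simulated white noise (true autocovariance vanishes for k >= 1).
rng = np.random.default_rng(1)
y_sim = rng.standard_normal(501)
grid = np.linspace(-np.pi, np.pi, 9)
f_hat, n_trunc_sim = series_estimator(y_sim, alpha=1.0, omegas=grid)
print(n_trunc_sim, f_hat)
```

The projection step onto $\Sigma_{\alpha,M}$ described later in this section is not part of this sketch.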



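Lemma 4.4 In the experiment $\mathcal{E}_n$ the estimator $\hat{f}_n$ fulfills for any $\beta \in (1/2, \alpha)$ and any $\gamma \in \left(0, \frac{\alpha-\beta}{2\alpha+1}\right)$

$$\sup_{f \in \Sigma} P_f\left(\|\hat{f}_n - f\|_{2,\beta} > n^{-\gamma}\right) \to 0. \tag{4.11}$$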



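Proof. By the Markov inequality, it suffices to prove

$$\sup_{f \in \Sigma} E_f\,\|\hat{f}_n - f\|^2_{2,\beta} = o(n^{-2\gamma}).$$

A bias-variance decomposition and Lemma 4.3 yield

$$\begin{aligned}
E_f\,\|\hat{f}_n - f\|^2_{2,\beta} &= \sum_{|k| \le \tilde{n}} \max\left(1, |k|^{2\beta}\right)\mathrm{Var}_f\,\hat{\gamma}_n(k) + \sum_{|k| > \tilde{n}} |k|^{2\beta}\gamma_f^2(k) \\
&\le \frac{C}{n}\sum_{|k| \le \tilde{n}} \max\left(1, |k|^{2\beta}\right)\left(\sum_{j=0}^{n-1}\gamma_f^2(j)\right) + \tilde{n}^{2(\beta-\alpha)}\sum_{|k| > \tilde{n}} |k|^{2\alpha}\gamma_f^2(k) \\
&\le C\,\|f\|_2^2\, n^{-1}\tilde{n}^{2\beta+1} + C\,|f|^2_{2,\alpha}\,\tilde{n}^{2(\beta-\alpha)} \\
&\le C\left(\|f\|_2^2 + |f|^2_{2,\alpha}\right) n^{-2(\alpha-\beta)/(2\alpha+1)}.
\end{aligned}$$

Since $\|f\|_2^2 \le C\,\|f\|^2_{2,\alpha}$ and $|f|^2_{2,\alpha} \le \|f\|^2_{2,\alpha}$, the result follows. ■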

We now turn to preliminary estimation in the periodic experiment $\tilde{\mathcal{E}}_n$ with data vector $\tilde{y}^{(n)}$.
Note that this data vector can be construed as coming from a stationary sequence with autocovariance function $\gamma_{(n)}(\cdot)$ given by (2.1) for $|k| \le n-1$ and $\gamma_{(n)}(k) = 0$ for $|k| > n-1$, i.e. the stationary sequence having spectral density $f_n$. Thus if $\hat{\gamma}_n(k)$ again denotes the empirical autocovariance function in this series then we can apply Lemma 4.3 to obtain



n-l 



Var%{k) < -—r^2l 2 n),fU), k = 0, . . . ,n - 1. 



3=0 



Obviously 



n-l 



. . (n-l)/2 (n-l)/2 

fc=0 fc=0 fc=l 
Now use the estimator (4.10) with $\tilde{n}$ as above; since $\tilde{n} = o((n-1)/2)$, we have the unbiasedness

$$E\,\hat{\gamma}_n(k) = \gamma_f(k), \qquad k = 0, \dots, \tilde{n}.$$

Thus the proof of the following result is entirely analogous to Lemma 4.4; the estimator $\hat{f}_n$ is also formally the same function of the data.

Lemma 4.5 In the experiment $\tilde{\mathcal{E}}_n$ the estimator $\hat{f}_n$ fulfills (4.11) for any $\beta \in (1/2, \alpha)$ and any $\gamma \in \left(0, \frac{\alpha-\beta}{2\alpha+1}\right)$.

Finally consider modifications such that the estimator takes values in $\Sigma_{\alpha,M}$. Consider the space $W^\beta = \left\{f \in L_2(-\pi, \pi) : \|f\|^2_{2,\beta} < \infty\right\}$; this is a periodic fractional Sobolev space which is Hilbert under the norm $\|f\|_{2,\beta}$. There the set $\Sigma_{\alpha,M}$ is compact and convex; hence there exists a ($\|\cdot\|_{2,\beta}$-continuous) projection operator $\Pi$ onto $\Sigma_{\alpha,M}$ in $W^\beta$ (cf. Balakrishnan (1976), Definition 1.4.1). Then

$$\|\Pi(\hat{f}_n) - f\|_{2,\beta} \le \|\hat{f}_n - f\|_{2,\beta}.$$

The modified estimators $\Pi(\hat{f}_n)$ thus again fulfill (4.11). A summary of results in this section is the following.

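Proposition 4.6 In both experiments $\mathcal{E}_n$ and $\tilde{\mathcal{E}}_n$ there are estimators $\hat{f}_n$ taking values in $\Sigma$ and fulfilling for any $\gamma \in \left(0, \frac{\alpha - 1/2}{2\alpha+1}\right)$

$$\sup_{f \in \Sigma} P_f\left(\|\hat{f}_n - f\|_\infty + \|\hat{f}_n - f\|_{2,1/2} > n^{-\gamma}\right) \to 0.$$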



4.4 Globalization 



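In this section we denote

$$P_{f,n} := \mathcal{L}(y^{(n)} \mid f) = N_n(0, \Gamma_n(f)), \qquad \tilde{P}_{f,n} := \mathcal{L}(\tilde{y}^{(n)} \mid f) = N_n(0, \tilde{\Gamma}_n(f)).$$

Consider again the experiment $\mathcal{E}^{\#}_{1,n}$ of (4.7) where two independent stationary series $y_1^{(m)}$ and $y_2^{(m)}$ of length $m = (n - r_n)/2$ are observed. In modified notation we now write

$$\mathcal{E}^{\#}_{1,n} = \mathcal{E}_m \otimes \mathcal{E}_m = \left(P_{f,m} \otimes P_{f,m},\ f \in \Sigma\right).$$

We shall compare this with the experiments

$$\mathcal{E}^{\#}_{2,n} := \mathcal{E}_m \otimes \tilde{\mathcal{E}}_m = \left(P_{f,m} \otimes \tilde{P}_{f,m},\ f \in \Sigma\right),$$
$$\mathcal{E}^{\#}_{3,n} := \tilde{\mathcal{E}}_m \otimes \tilde{\mathcal{E}}_m = \left(\tilde{P}_{f,m} \otimes \tilde{P}_{f,m},\ f \in \Sigma\right).$$

At this point select the shrinking rate $\kappa_n$ of the neighborhoods $\Sigma_n(f_0)$ (cp. (4.1)) as

$$\kappa_n = n^{-\gamma}, \qquad \gamma = \frac{\alpha - 1/2}{2(2\alpha + 1)}. \tag{4.12}$$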
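Proposition 4.7 We have $\mathcal{E}^{\#}_{2,n} \precsim \mathcal{E}^{\#}_{1,n}$.

Proof. We shall construct a sequence of Markov kernels $M_n$ such that

$$\sup_{f \in \Sigma} H^2\left(P_{f,m} \otimes \tilde{P}_{f,m},\ M_n\left(P_{f,m} \otimes P_{f,m}\right)\right) \to 0.$$

Define $M_n$ as follows: given $y_1^{(m)}$ and $y_2^{(m)}$, and $A$ a measurable subset of $\mathbb{R}^{2m}$, set

$$M_n\left(A \mid y_1^{(m)}, y_2^{(m)}\right) = \mathbf{1}_A\left(y_1^{(m)},\ K_m\!\left(\hat{f}_m(y_1^{(m)})\right) y_2^{(m)}\right)$$

where $K_m(f)$ is the matrix defined by (4.2), i.e. for $f \in \Sigma$ by

$$K_m(f) = \tilde{\Gamma}_m^{1/2}(f)\,\Gamma_m^{-1/2}(f)$$

and $\hat{f}_m$ is the estimator in $\mathcal{E}_m$ of Proposition 4.6, applied to data $y_1^{(m)}$. Thus the Markov kernel $M_n$ is in fact a deterministic map, i.e. given $y_1^{(m)}$, $y_2^{(m)}$, it defines a one point measure on $\mathbb{R}^{2m}$ concentrated in $\left(y_1^{(m)},\ K_m(\hat{f}_m(y_1^{(m)}))\,y_2^{(m)}\right)$. Thus the law $M_n(P_{f,m} \otimes P_{f,m})$ is the joint law of $y_1^{(m)}$ and $K_m(\hat{f}_m(y_1^{(m)}))\,y_2^{(m)}$ under $f$. The latter we split up into the marginal law of $y_1^{(m)}$, i.e. $P_{f,m}$, and the conditional law of $K_m(\hat{f}_m(y_1^{(m)}))\,y_2^{(m)}$ given $y_1^{(m)}$; write $P^*_{f,m} \mid y_1^{(m)}$ for the latter. We have

$$P^*_{f,m} \mid y_1^{(m)} = N_m\left(0, K\,\Gamma_m(f)\,K'\right) \quad \text{for } K = K_m\!\left(\hat{f}_m(y_1^{(m)})\right).$$

Now clearly

$$H^2\left(P_{f,m} \otimes \tilde{P}_{f,m},\ M_n\left(P_{f,m} \otimes P_{f,m}\right)\right) = E_f\,H^2\left(\tilde{P}_{f,m},\ P^*_{f,m} \mid y_1^{(m)}\right) \tag{4.13}$$

where $E_f$ is taken wrt $y_1^{(m)}$ under $P_{f,m}$. Define

$$B_{f,m} := \left\{y : \|\hat{f}_m(y) - f\|_\infty + \|\hat{f}_m(y) - f\|_{2,1/2} \le \kappa_m\right\}.$$

By definition of $\Sigma_m(f_0)$ (cf. (4.1)) we have $f \in \Sigma_m(\hat{f}_m(y))$ if $y \in B_{f,m}$. Thus Lemma 4.1 implies

$$\sup_{f \in \Sigma}\ \sup_{y \in B_{f,m}} H^2\left(\tilde{P}_{f,m},\ P^*_{f,m} \mid y\right) = o(1). \tag{4.14}$$

Moreover by Proposition 4.6

$$P_{f,m}\left(B^c_{f,m}\right) = o(1) \quad \text{uniformly over } f \in \Sigma.$$

Hence

$$E_f\,H^2\left(\tilde{P}_{f,m},\ P^*_{f,m} \mid y_1^{(m)}\right) = \int_{B_{f,m}} H^2\left(\tilde{P}_{f,m},\ P^*_{f,m} \mid y\right) dP_{f,m}(y) + o(1) = o(1)\,P_{f,m}(B_{f,m}) + o(1) = o(1) \tag{4.15}$$

uniformly over $f \in \Sigma$. In conjunction with (4.13) the last relation proves the claim. ■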

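The next result is entirely analogous if we replace the estimator $\hat{f}_m$ based on data $y_1^{(m)}$ by the one based on data $\tilde{y}_2^{(m)}$ and formally reverse the order in the product $P_{f,m} \otimes \tilde{P}_{f,m}$.

Proposition 4.8 We have $\mathcal{E}^{\#}_{3,n} \precsim \mathcal{E}^{\#}_{2,n}$.

Proof. We construct a sequence of Markov kernels $M_n$ such that

$$\sup_{f \in \Sigma} H^2\left(\tilde{P}_{f,m} \otimes \tilde{P}_{f,m},\ M_n\left(P_{f,m} \otimes \tilde{P}_{f,m}\right)\right) \to 0.$$

Define $M_n$ as follows: given $y_1^{(m)}$ and $\tilde{y}_2^{(m)}$, and $A$ a measurable subset of $\mathbb{R}^{2m}$, set

$$M_n\left(A \mid y_1^{(m)}, \tilde{y}_2^{(m)}\right) = \mathbf{1}_A\left(K_m\!\left(\hat{f}_m(\tilde{y}_2^{(m)})\right) y_1^{(m)},\ \tilde{y}_2^{(m)}\right)$$

where $\hat{f}_m$ is the estimator defined in the previous subsection, applied to data $\tilde{y}_2^{(m)}$. Analogously to (4.14) we have

$$\tilde{P}_{f,m}\left(B^c_{f,m}\right) = o(1) \quad \text{uniformly over } f \in \Sigma.$$

A reasoning as in (4.15) completes the proof. ■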

For the experiment $\mathcal{E}^{\#}_{3,n}$, which consists of product measures $\tilde{P}_{f,m} \otimes \tilde{P}_{f,m}$, we can invoke Proposition 3.2, applying the equivalence map given there componentwise (i.e. to the independent components $\tilde{y}_1^{(m)}$, $\tilde{y}_2^{(m)}$ in $\mathcal{E}^{\#}_{3,n}$). A summary of the lower informativity bound results so far can thus be given as follows. For $r_n = 2[(\log n)/2] + 1$ define the lower bracket Gaussian scale experiment $\mathcal{E}_{l,n}$ by

£l,n ■= £(n-r n )/2 ® £(n-r n )/2- ( 4 -16) 

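Corollary 4.9 Consider experiments $\mathcal{E}_n$ and $\mathcal{E}_{l,n}$ given respectively by (1.3) and (4.16), (3.4) with parameter space $\Sigma = \Sigma_{\alpha,M}$ where $M > 0$, $\alpha > 1/2$. Then as $n \to \infty$

$$\mathcal{E}_{l,n} \precsim \mathcal{E}_n.$$
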
4.5 Bracketing the Gaussian scale model 

The proof of Theorem 1.1 is complete if the lower and upper informativity bounds $\mathcal{E}_{l,n}$ and $\mathcal{E}_{u,n}$ coincide in an asymptotic sense. Since we already established the relation $\mathcal{E}_{l,n} \precsim \mathcal{E}_n \precsim \mathcal{E}_{u,n}$ (Corollaries 3.3, 4.9), it now suffices to show that $\mathcal{E}_{u,n} \precsim \mathcal{E}_{l,n}$. This essentially means that in the special nonparametric regression model of Gaussian scale type, having $r_n$ additional observations does not matter asymptotically. "Additional observations" here refers to an equidistant design of higher grid size. The problem of additional observations for i.i.d. models has been discussed by Le Cam (1974) and Mammen (1986) under parametric assumptions. For nonparametric i.i.d. models, one can use the approximation by Gaussian white noise or Poisson models to bound the influence of additional observations. For simplicity, consider a Gaussian white noise model on $[0, 1]$

$$dZ_t = f(t)\,dt + n^{-1/2}\,dW_t, \qquad t \in [0, 1],\ f \in \Sigma$$

with parameter space $\Sigma$. Consider this experiment $\mathcal{F}_n$, say, and also $\mathcal{F}_{n+r_n}$. Multiplying the data by $n^{1/2}$ gives an equivalent experiment

$$dZ^*_t = n^{1/2} f(t)\,dt + dW_t, \qquad t \in [0, 1],\ f \in \Sigma$$

and the corresponding one for $(n + r_n)^{1/2}$. Now, for given $f$, the squared Hellinger distance of the two respective measures is bounded by

$$C\left((n + r_n)^{1/2} - n^{1/2}\right)^2 \|f\|_2^2 = C\,\frac{r_n^2}{n}\,(1 + o(1))\,\|f\|_2^2$$

if $r_n = o(n)$. Thus if $r_n = o(n^{1/2})$ and $\sup_{f \in \Sigma}\|f\|_2^2 \le C$ then we have $\mathcal{F}_n \approx \mathcal{F}_{n+r_n}$.
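
The size of this Hellinger distance is easy to evaluate. The sketch below uses the standard fact that two Gaussian shift measures with the same covariance and drift difference $d$ have Hellinger affinity $\exp(-\|d\|^2/8)$, together with the convention $H^2 = 2(1 - \text{affinity})$; the function name and the numerical values of $n$, $r_n$ and $\|f\|_2^2$ are illustrative assumptions.

```python
import math

def white_noise_hellinger_sq(n, r, f_norm_sq):
    """Squared Hellinger distance between the laws of
    dZ = n^{1/2} f dt + dW and dZ = (n + r)^{1/2} f dt + dW on [0, 1].

    The drift difference has squared L2 norm ((n + r)^{1/2} - n^{1/2})^2 * ||f||^2.
    """
    shift_sq = (math.sqrt(n + r) - math.sqrt(n)) ** 2 * f_norm_sq
    return -2.0 * math.expm1(-shift_sq / 8.0)

n_obs = 10 ** 6
h2_small = white_noise_hellinger_sq(n_obs, 10, 1.0)     # r_n far below n^{1/2}
h2_large = white_noise_hellinger_sq(n_obs, 1000, 1.0)   # r_n equal to n^{1/2}
print(h2_small, h2_large)
```

With $r_n$ of order $n^{1/2}$ the distance stays bounded away from zero, which matches the requirement $r_n = o(n^{1/2})$ above.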

Comparable results can be obtained for nonparametric i.i.d. and regression models if these can be approximated by $\mathcal{F}_n$. In the present case, conversely, for the nonparametric Gaussian scale regression, a result of this type (asymptotic equivalence of the models with grid sizes $n$ and $n + r_n$) is a prerequisite for the Gaussian location (white noise) approximation. Note that for a narrower parameter space, given by a Lipschitz class, the white noise approximation of the Gaussian scale regression has been established (cf. Grama and Nussbaum, 1998).

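Remark 4.10 The relation

$$\mathcal{E}_{l,n} \precsim \mathcal{E}_n \precsim \mathcal{E}_{u,n} \tag{4.17}$$

has been proved under the technical assumption that $n$ is uneven. If $n$ is even, note first that $\mathcal{E}_{n-1} \precsim \mathcal{E}_n \precsim \mathcal{E}_{n+1}$ (omitting one observation from $\mathcal{E}_{n+1}$ and $\mathcal{E}_n$) and apply (4.17) to obtain

$$\mathcal{E}_{l,n-1} \precsim \mathcal{E}_n \precsim \mathcal{E}_{u,n+1}.$$

The relation $\mathcal{E}_{u,n} \precsim \mathcal{E}_{l,n}$, which will be proved for uneven $n$ in the remainder of this section, is easily seen to extend to $\mathcal{E}_{u,n+2} \precsim \mathcal{E}_{l,n}$. This suffices to establish the main result Theorem 1.1 for general sample size $n \to \infty$.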

4.5.1 First part of the bracketing argument 

Denote again $m = (n - r_n)/2$ where $r_n = 2[(\log n)/2] + 1$.

Lemma 4.11 For £^ n = £ m <g) £ m we have 

£ m §3 £m ~ ^2m • 

Proof. Note that the measures in £ m ® £ m are product measures, which can be described, 
after a rearrangement of components, as 

TO 

:= (g) (iV(0, J i>m (/)) JV(0, J i)OT (/))) 
whereas the measures in the Gaussian scale model with grid size $2m$ are



2m 

rn 



Q 2 ,m := CS) ( N & J^j-lMf)) ® ^(0, <%,2m(/))) • 

Now Lemma 12 . 1 1 yields 

m 

H 2 (Qi >m , Q2,m) < C ^ f(^2j-l,2m (/) " Jj,m(f)Y + fem(/) " -W/))' 
i=i 

Define a partition of (— vr,7r) into n intervals Wj, n , j = 1, ... , n of equal length and for any 
/ <E L 2 (-7r,7r), let 

n 

In = Y. J 3M)^W 3 , n (4.18) 

be the ^-projection of / onto piecewise constant functions wrt the partition. Note that we 
have 

1 1 /2m - fmWl = ^2 ((^2j-l,2m(/) ~ Jj,m(f)) 2 + (J2j,2m(f) ~ Jj,m(f)) 2 ^) 

171 3=1 

so that 

# 2 (<5l,m,Q2,m) < Cm || / 2m - /m||2 < Cm - /2m||2 + ||/ ~ fm\\l) ■ 

The result now follows from 

supm||/-/ m ||2^0. (4.19) 
which is a consequence of Lemmas 15.31 and 15.51 ■ 

4.5.2 Second part of the bracketing argument 

In view of £ % m = £ n -r„ , our next aim is to show 

F w F 

where r n does not grow too quickly. Previously we defined r n = 2[(logn)/2] + 1, but we will 
assume more generally now that r n = o(n 1 / 2 ). 

Consider the gamma density with shape parameter $a > 0$

$$g_a(x) = \frac{1}{\Gamma(a)}\,x^{a-1}\exp(-x), \qquad x > 0$$

where $\Gamma(a)$ is the gamma function, and more generally the density with additional scale parameter $s > 0$

$$g_{a,s}(x) = \frac{1}{\Gamma(a)}\,s^{-a}x^{a-1}\exp(-x s^{-1}), \qquad x > 0.$$

We will call the respective law the $\Gamma(a, s)$ law. Clearly if $X \sim \Gamma(a, 1)$ then $sX \sim \Gamma(a, s)$. It is well known that $\Gamma(n/2, 2) = \chi^2_n$ and that the following result holds. Assume $X \sim \Gamma(a, s)$ and $Y \sim \Gamma(b, s)$ are independent; then $X + Y$, $X/(X + Y)$ are independent random variables, and $X + Y \sim \Gamma(a + b, s)$ while $X/(X + Y)$ has a Beta$(a, b)$ distribution (Bickel and Doksum (2001), Theorem B.2.3, p. 489).

Furthermore, for fixed $a > 0$ consider the family of laws

$$\left(\Gamma(a, s),\ s > 0\right). \tag{4.20}$$

Clearly this is a one parameter exponential family; the shape of this exponential family implies that in a product family

$$\left(\Gamma^{\otimes n}(a, s),\ s > 0\right)$$

with $n$ i.i.d. observations $X_1, \dots, X_n$, the sum $\sum_{i=1}^n X_i$ is a sufficient statistic. This sufficient statistic has law $\Gamma(na, s)$; hence for any subset $S \subset (0, \infty)$ we have the equivalence of experiments

$$\left(\Gamma^{\otimes n}(a, s),\ s \in S\right) \sim \left(\Gamma(na, s),\ s \in S\right). \tag{4.21}$$
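Lemma 4.12 For all $a > 0$ and for $s, t > 0$

$$H^2\left(\Gamma(a, s), \Gamma(a, t)\right) = 2\left(1 - \left(1 - \frac{\left(s^{1/2} - t^{1/2}\right)^2}{s + t}\right)^a\right).$$

Proof. We have

$$H^2\left(\Gamma(a, s), \Gamma(a, t)\right) = 2\left(1 - \int g_{a,s}^{1/2}(x)\,g_{a,t}^{1/2}(x)\,dx\right).$$

With a substitution $u = \frac{x}{2}\left(\frac{1}{s} + \frac{1}{t}\right)$ this becomes

$$\int g_{a,s}^{1/2}(x)\,g_{a,t}^{1/2}(x)\,dx = \frac{1}{\Gamma(a)}\left(\frac{2 s^{1/2} t^{1/2}}{s + t}\right)^a \int_0^\infty u^{a-1}\exp(-u)\,du = \left(\frac{2 s^{1/2} t^{1/2}}{s + t}\right)^a = \left(1 - \frac{\left(s^{1/2} - t^{1/2}\right)^2}{s + t}\right)^a.$$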

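Lemma 4.13 We have, for all $s > 0$ and $a, b > 0$

$$H^2\left(\Gamma(a, s), \Gamma(b, s)\right) = 2\left(1 - \frac{\Gamma((a + b)/2)}{\left(\Gamma(a)\Gamma(b)\right)^{1/2}}\right).$$

Proof. In this case

$$\int g_{a,s}^{1/2}(x)\,g_{b,s}^{1/2}(x)\,dx = \frac{1}{\Gamma^{1/2}(a)\,\Gamma^{1/2}(b)}\int x^{(a+b)/2 - 1}\,s^{-(a+b)/2}\exp\left(-x s^{-1}\right)dx = \frac{\Gamma((a + b)/2)}{\Gamma^{1/2}(a)\,\Gamma^{1/2}(b)}.$$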



In E n we observe (cp. (I3.4|) 



Z 3 = J j,n (/)&> i = I,--.," 



for independent standard normals £j, which by sufficiency is equivalent to observing 
Jj,n(f)£,j- Thus £ n is equivalent to 



«n,i : = I QS)r(l/2,2J iin (/)), / G £ | . (4.22) 

Set again m — n — r n . The above experiment in turn is equivalent, by the sufficiency argument 
for the scaled gamma law invoked in (|4,21|) . to 

n 



£n,m ■= \ ($r® ro (l/2m,2J ijn (/)), /€ E 



Analogously we have 



£m ~ £m,i ~ £m,n := | (^r^(l/2n, 2J j>m (/)), / G £ J . (4.23) 



Introduce an intermediate experiment 

(m 
®r®»(l/2m,2J i)m (/)), /g£ 

Lemma 4.14 VFe Ziaue f/ie totoZ variation asymptotic equivalence 



Proof. Write the measures in £ nm as a product of mn components, i.e. as <8>2JiQi,i where the 
component measures are defined as follows. For every i = 1, . . . , mn, let be the 

unique index j G {1, . . . , n} such that there exists k G {1, . . . , m} for which i = (j — l)m + /c. 
Then 

Ql,i ■= r(l/2m,2J j(l i)in (/)), i = 1, .. . ,mn. 

Analogously, let j(2,i) be the unique index j G {l,...,m} such that there exists fc G 
{1, . . . ,n} for which i = (j — l)n + A;. Then the measures in n are a product of mn 
components, i. e. are <S>^™(32,i where 

<32,i = r(l/2m, 2J J -( 2)i ) im (/)), i = 1, . . . , mn. 
Then the Hellinger distance between measures in the two experiments is, using Lemma 2.19 in Strasser (1985) and then Lemma 4.12,



ran ran 



1=1 



4£ 



H 2 (g)Qi,i, ® ^2,i ) < 2 ^ # 2 (Q M , Q 2 
Vi=l i=l 

/ / 

1 - 



(4.24) 



1=1 



1/2 



2\ V 2 "A 



Jj(l,i),n(f) + Jj(2,i),m(f) 



By using the inequality 



( s i/2_ t i/2y 
s + t 



(s - ty 



(s + t)(sV2 +il/2)S 



< 



and observing that for / £ S, we have Jjn(f) — M , we obtain an upper bound for (I4.24|) 

mn / l/2m\ 

4 E 1 " ( X " M ' ( J i(M),"(/) " ^(2,), m (/)) 2 ) m • (4.25) 
i=l ^ ' 

The expression </j(i j j) jra (/) — Jj(2,i),m(f) can be described as follows. For any a; 6 (^n> mn)' 
i = 1, . . . , mu we have 



Jj(l,i),n{f) ~ Jj(2,i),m\f) ~ fn( x ) ~ fm\ x )- 
where f n is defined by (|4.18ft . Now as a consequence of Lemmas 15.41 and 15.5 



sup /„ - f m \\ < sup /-/„ + sup / - f m \\ = o(l). 
/es" "°° /es" 1100 /es" 



(4.26) 
(4.27) 



Note that for m — > oo and z-*0we have 



'1 - 



2\ l/2m 



<X1> 1 2^ 1 ° g ^ 1_C ' Z 



= ^P (-^ (C. 2 + 0(z 4 ))) = 1 - J- (Cz 2 + 0(, 4 )) + o 
Thus from (I4.25|) we obtain in view of f|4.2T|> 

(ran ran \ mn - 

(g)Ql,i, ®Q 2> i < - (^(l-0,n(/) " ^(2, i ), m (/)) 2 (1 + O(l)). 

i=l i=l / i=l 

As a consequence of (I4.26|) we obtain 

mn ^ 

\\fn ~ fra\\ 2 = ^ — (Jj(l,i),n(f) ~ Jj(2,i),ra(f)) 

which implies 

H 2 ((g)QM,(g)Q2 



1=1 



mn mn 



< Cn ||/ n - / m ||^ < Cn 11/ - f m \\ 2 + Cn 11/ - f n \\l 



\i=i i=x j 
Now as in (14.190 this upper bound is o(l) uniformly over / € S. 
Lemma 4.15 We have the asymptotic equivalence

Proof. We know (cf. that £ min ~ £ m ,l where 



£m,i = | Q9r(i/2,2J i)OT (/)), /€ s | . 



Analogously, using (14.211) again, we obtain 



£* m ,n ~ := | ($r(n/2m,2J,-, OT (/)), / G S 

For given / G E, the Hellinger distance between the two respective product measures is 
bounded by (using Lemma 2.19 in Strasser (1985) and then Lemma l4.13p 

*£* (r (1 /2. 2 ,4„(/)),r(„/ 2m , 2 ,„„(/))) = <£(,- (r( r i % 4 r ;^ ) ■ 

Note that this bound does not depend on / 6 S. Write n/m = 1 + 5 where 5 = r n /m; the 
above is 

l " (r(i/2)r(i/2 + 5/2)f/ 2 - r(i/2 + 5/4) 
£j (r(i/2)r(i/2 + 5/2)) 1 / 2 

The Gamma function is infinitely differentiable on (0, oo); by a Taylor expansion we obtain 

r(i/2 + 5/4) = r(i/2) + r'(i/2)| + o(5 2 ), 
^(1/2 + 5/2) = r 1 / 2 (l/2) + ir- 1 / 2 (i/2)r'(i/2)| + 0(5 2 ). 

Consequently 

(T(l/2)r(l/2 + 5/2)) 1/2 - r(l/2 + 5/4) = 0(5 2 ) 
so that (I4.28P becomes 

^r(l/2)(l + o(l)) K) ~ m { ' 

The condition r n = o(n 1//2 ) now implies that this upper bound is o(l). We thus established 
total variation asymptotic equivalence £ m ,i — £ m ,i- H 
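
The order of magnitude in this proof can be checked numerically. By Lemma 4.13, each of the $m$ components contributes $2\left(1 - \Gamma(1/2 + \delta/4)/(\Gamma(1/2)\Gamma(1/2 + \delta/2))^{1/2}\right)$ with $\delta = r_n/m$, which is of order $\delta^2$. The sketch below evaluates this with log-gamma functions; the second-order constant $\pi^2/32$ used in the check comes from a Taylor expansion of $\log\Gamma$ at $1/2$ and is an addition for illustration, as are the function names and numerical values.

```python
import math

def component_hellinger_sq(delta):
    """H^2(Gamma(1/2, s), Gamma((1 + delta)/2, s)) from Lemma 4.13 (free of the scale s)."""
    a = 0.5
    b = 0.5 * (1.0 + delta)
    log_affinity = math.lgamma(0.5 * (a + b)) - 0.5 * (math.lgamma(a) + math.lgamma(b))
    return -2.0 * math.expm1(log_affinity)

def total_bound(n, r):
    """Sum of the m = n - r component distances, with delta = r / m."""
    m = n - r
    return m * component_hellinger_sq(r / m)

slow = total_bound(10 ** 6 + 13, 13)         # r_n of order log n
fast = total_bound(10 ** 6 + 1000, 1000)     # r_n of order n^{1/2}
ratio = component_hellinger_sq(0.01) / 0.01 ** 2
print(slow, fast, ratio)
```

The total behaves like a constant times $r_n^2/m$, so it vanishes exactly when $r_n = o(n^{1/2})$.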
5 Appendix: auxiliary statements and analytic facts



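5.1 Proof of Lemma 2.1

Consider the spectral decompositions of $A$ and $B$:

$$A = C_1 \Lambda_1 C_1', \qquad B = C_2 \Lambda_2 C_2'$$

where $\Lambda_i$ are $n \times n$ diagonal matrices and $C_i$ are orthogonal matrices. Recall the simultaneous diagonalization of $A$ and $B$: setting $D = \Lambda_1^{-1/2} C_1'$, we obtain

$$D A D' = I_n, \qquad \bar{B} := D B D'$$

and letting $\bar{B} = C \Lambda C'$ be the spectral decomposition of $\bar{B}$, we obtain with $\tilde{D} := C' D$

$$\tilde{D} A \tilde{D}' = I_n, \qquad \tilde{D} B \tilde{D}' = \Lambda.$$

We now claim that

$$\|I_n - \Lambda\|^2 \le M^2\,\|A - B\|^2. \tag{5.1}$$

Indeed we have

$$\|I_n - \Lambda\|^2 = \mathrm{tr}\left[(I_n - \Lambda)(I_n - \Lambda)\right] = \mathrm{tr}\left[\left(\tilde{D} A \tilde{D}' - \tilde{D} B \tilde{D}'\right)\left(\tilde{D} A \tilde{D}' - \tilde{D} B \tilde{D}'\right)\right] = \mathrm{tr}\left[\tilde{D}'\tilde{D}\,(A - B)\,\tilde{D}'\tilde{D}\,(A - B)\right].$$

Now for eigenvalues $\lambda_{\max}(\cdot)$ we have

$$\lambda_{\max}(\tilde{D}'\tilde{D}) = \lambda_{\max}(\tilde{D}\tilde{D}') = \lambda_{\max}(C' D D' C) = \lambda_{\max}(D D') = \lambda_{\max}\left(C_1 \Lambda_1^{-1/2}\Lambda_1^{-1/2} C_1'\right) = \lambda_{\min}^{-1}(A) \le M, \tag{5.2}$$

hence

$$\|I_n - \Lambda\|^2 \le M\,\mathrm{tr}\left[\tilde{D}'\tilde{D}\,(A - B)(A - B)\right] \le M^2\,\mathrm{tr}\left[(A - B)(A - B)\right] = M^2\,\|A - B\|^2$$

so that (5.1) is proved. Similarly to (5.2) we obtain a bound from below

$$\lambda_{\min}(\tilde{D}'\tilde{D}) = \lambda_{\max}^{-1}(A) \ge M^{-1}$$

which yields analogously to (5.1)

$$\|I_n - \Lambda\|^2 \ge M^{-2}\,\|A - B\|^2. \tag{5.3}$$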
Consider now the Hellinger affinity $A_H(\cdot, \cdot)$ between the one dimensional normals $N(0, 1)$ and $N(0, \sigma^2)$: if $\varphi$ is the standard normal density then

$$A_H\left(N(0, 1), N(0, \sigma^2)\right) = \sigma^{-1/2}\int \varphi^{1/2}(x)\,\varphi^{1/2}(x\sigma^{-1})\,dx = \left(\frac{2\sigma}{1 + \sigma^2}\right)^{1/2} = \left(1 - \frac{(1 - \sigma)^2}{1 + \sigma^2}\right)^{1/2} = \left(1 - \frac{(1 - \sigma^2)^2}{(1 + \sigma)^2 (1 + \sigma^2)}\right)^{1/2}.$$

Let $h = \sigma^2 - 1$; then as $h \to 0$

$$\log A_H\left(N(0, 1), N(0, \sigma^2)\right) = -\frac{h^2}{16}\,(1 + o(1)). \tag{5.4}$$
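
The expansion (5.4) can be checked against the closed form of the affinity. The sketch below evaluates $\log A_H$ for a small $h$ and also computes $H^2(N_n(0, I_n), N_n(0, \Lambda))$ for a diagonal $\Lambda$ as $2(1 - \prod_i A_H(N(0,1), N(0,\lambda_i)))$; the function names and the numerical values are illustrative assumptions.

```python
import math

def normal_affinity(sigma_sq):
    """Hellinger affinity between N(0, 1) and N(0, sigma_sq): (2 sigma / (1 + sigma^2))^{1/2}."""
    sigma = math.sqrt(sigma_sq)
    return math.sqrt(2.0 * sigma / (1.0 + sigma_sq))

def hellinger_sq_diag(lams):
    """H^2 between N(0, I) and N(0, diag(lams)), using the product form of the affinity."""
    log_affinity = sum(math.log(normal_affinity(lam)) for lam in lams)
    return -2.0 * math.expm1(log_affinity)

h = 0.01
expansion_ratio = math.log(normal_affinity(1.0 + h)) / (-(h ** 2) / 16.0)

lams_example = [1.01, 0.99, 1.02]
h2_exact = hellinger_sq_diag(lams_example)
h2_quadratic = sum((1.0 - lam) ** 2 for lam in lams_example) / 8.0
print(expansion_ratio, h2_exact, h2_quadratic)
```

For eigenvalues close to 1 the exact value and the quadratic approximation $\frac{1}{8}\sum_i (1 - \lambda_i)^2$ agree to leading order.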



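The matrix $\tilde{D}$ is nonsingular, and since the Hellinger distance is invariant under one-to-one transformations,

$$\begin{aligned}
H^2\left(N_n(0, A), N_n(0, B)\right) &= H^2\left(N_n(0, I_n), N_n(0, \Lambda)\right) \\
&= 2\left(1 - A_H\left(N_n(0, I_n), N_n(0, \Lambda)\right)\right) = 2\left(1 - \prod_{i=1}^n A_H\left(N(0, 1), N(0, \lambda_i)\right)\right) \\
&= 2\left(1 - \exp\left(\sum_{i=1}^n \log A_H\left(N(0, 1), N(0, \lambda_i)\right)\right)\right)
\end{aligned} \tag{5.5}$$

where $\lambda_i$, $i = 1, \dots, n$ are the diagonal elements of $\Lambda$. Let us assume that $\|A - B\| \le \epsilon \to 0$ where the dimension $n$ of $A$, $B$ may vary arbitrarily. Since

$$\sup_{i=1,\dots,n} (1 - \lambda_i)^2 \le \sum_{i=1}^n (1 - \lambda_i)^2 = \|I_n - \Lambda\|^2$$

we may write, in view of (5.1) and (5.4)

$$\log A_H\left(N(0, 1), N(0, \lambda_i)\right) = -\frac{1}{16}\,(1 - \lambda_i)^2\,(1 + \rho_i)$$

where $\sup_{i=1,\dots,n} |\rho_i| \to 0$ as $\epsilon \to 0$. Since

$$\left|\sum_{i=1}^n \rho_i\,(1 - \lambda_i)^2\right| \le \sup_{i=1,\dots,n} |\rho_i| \sum_{i=1}^n (1 - \lambda_i)^2$$

we obtain for $\epsilon \to 0$

$$-16\sum_{i=1}^n \log A_H\left(N(0, 1), N(0, \lambda_i)\right) = \sum_{i=1}^n (1 - \lambda_i)^2 + \sum_{i=1}^n \rho_i\,(1 - \lambda_i)^2 = \sum_{i=1}^n (1 - \lambda_i)^2 + o(1)\sum_{i=1}^n (1 - \lambda_i)^2 = \|I_n - \Lambda\|^2\,(1 + o(1))$$

and as a consequence from (5.5)

$$H^2\left(N_n(0, A), N_n(0, B)\right) = 2\left(1 - \exp\left(-\frac{1}{16}\,\|I_n - \Lambda\|^2\,(1 + o(1))\right)\right) = \frac{1}{8}\,\|I_n - \Lambda\|^2\,(1 + o(1)).$$

In conjunction with (5.1) and (5.3), the last relation proves the lemma.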



5.2 An auxiliary result for the proof of Lemma 4.1

Let $A$, $B$ be two $n \times n$ covariance matrices. Recall that for every covariance matrix $A$ there is a uniquely defined symmetric square root matrix $A^{1/2}$: if $A = D \Lambda D^\top$ is a spectral decomposition ($D$ orthogonal, $\Lambda$ diagonal) of $A$ then $A^{1/2} = D \Lambda^{1/2} D^\top$.

Lemma 5.1 Let $A$, $B$ be two $n \times n$ covariance matrices. Then

$$\left\|A^{1/2} - B^{1/2}\right\|\,\lambda_{\min}\left(A^{1/2} + B^{1/2}\right) \le \|A - B\|.$$



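Proof. Observe that

$$\left(A^{1/2} - B^{1/2}\right) B^{1/2} + A^{1/2}\left(A^{1/2} - B^{1/2}\right) = A - B,$$
$$\left(A^{1/2} - B^{1/2}\right) A^{1/2} + B^{1/2}\left(A^{1/2} - B^{1/2}\right) = A - B.$$

Add up the two equations and set $S = A^{1/2} + B^{1/2}$, $D = A^{1/2} - B^{1/2}$; then

$$DS + SD = 2\,(A - B). \tag{5.6}$$

Take the squared norm $\|\cdot\|^2$ on both sides and observe

$$\|DS + SD\|^2 = \mathrm{tr}\left[(DS + SD)(DS + SD)\right] = 2\,\mathrm{tr}\left[DSSD\right] + 2\,\mathrm{tr}\left[DSDS\right].$$

Clearly we have

$$\mathrm{tr}\left[DSSD\right] \ge \left(\lambda_{\min}(S)\right)^2 \mathrm{tr}\left[DD\right],$$
$$\mathrm{tr}\left[DSDS\right] = \mathrm{tr}\left[S^{1/2} D S D S^{1/2}\right] \ge \lambda_{\min}(S)\,\mathrm{tr}\left[S^{1/2} D D S^{1/2}\right] \ge \left(\lambda_{\min}(S)\right)^2 \mathrm{tr}\left[DD\right].$$

The last two displays imply

$$\|DS + SD\|^2 \ge 4\left(\lambda_{\min}(S)\right)^2 \|D\|^2$$

which in conjunction with (5.6) yields

$$\|D\|\,\lambda_{\min}(S) \le \|A - B\|.$$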
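
Lemma 5.1 is easy to test numerically. The sketch below draws random positive definite matrices (illustrative stand-ins, not covariance matrices from the text), computes both sides of the inequality in the Frobenius norm, which is the norm $\|M\|^2 = \mathrm{tr}[MM']$ used in the proof, and reports the largest ratio observed. The function names are hypothetical.

```python
import numpy as np

def sym_sqrt(a):
    """Symmetric square root of a symmetric positive definite matrix."""
    eigvals, eigvecs = np.linalg.eigh(a)
    return (eigvecs * np.sqrt(eigvals)) @ eigvecs.T

def lemma_5_1_sides(a, b):
    """Return (||A^{1/2} - B^{1/2}|| * lambda_min(A^{1/2} + B^{1/2}), ||A - B||)."""
    root_a, root_b = sym_sqrt(a), sym_sqrt(b)
    lam_min = np.linalg.eigvalsh(root_a + root_b)[0]   # eigenvalues come in ascending order
    lhs = np.linalg.norm(root_a - root_b, "fro") * lam_min
    rhs = np.linalg.norm(a - b, "fro")
    return lhs, rhs

def random_spd(rng, n):
    """Random symmetric positive definite matrix (illustrative stand-in only)."""
    m = rng.standard_normal((n, n))
    return m @ m.T + 0.1 * np.eye(n)

rng = np.random.default_rng(2)
ratios = []
for _ in range(50):
    lhs, rhs = lemma_5_1_sides(random_spd(rng, 6), random_spd(rng, 6))
    ratios.append(lhs / rhs)
worst_ratio = max(ratios)
print(worst_ratio)
```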
5.3 Besov spaces on an interval



Let / be a function denned on I = [0, 1] and for < h < 1 define 

-l-h 



II A h /||j; := / \f(x)-f(x + h)\ p dx for 1 < p < oo, 

JO 

1^/^ = snp \f( x )-f( x + h)\. 

0<x<l-h 

For 1 < p < oo, the modulus of smoothness is defined as 

w(f,t) p := sup \\A h f\\ 

0<h<t 

For < a < 1 and 1 < q < co define a Besov type seminorm \f\ Ba by 

|/| B « = supw(/,t) p for g = oo 

p - 9 t>i 

and a norm ||/|| BQ by 

ll/ll S «/=ll/llp + l/l Bp% - 
The Besov space Bp (for l<p<oo,0<a<l)is the set of / where ||/|| B a < oo, equipped 
with the norm ||-|| Ra • Define also the Holder norm 

ii/ii c . == \\f\L+^p lfi f~ f J y)l (5.7) 

x^y \ x y\ 

and the corresponding Holder space C a . For two different spaces, B and B' say, an embedding 
theorem (written B 2?') is a norm inequality 

ll/ll^<C||/|| B 

where C depends on B', B. Thus the embedding implies the set inclusion B C B'. We cite 
the basic embedding theorem for our case, which is obtained by combining Theorems 18.4 
, 18.5 18.8 in Besov, E'in and Nikol'skii (1979) with Theorems 3.3.1 and 2.5.7. in Tricbcl 
(1983), for the special case of a domain [0, 1]. 

Proposition 5.2 Let < oi < a < 1 and 1 < p,q,p',q' < co. Then 

i) if q < q' then 

idol r>a 

&p,q ^ &p,q> 

ii) if p < p' and a — (1/p — l/p') > then for a' = a — (1/p — 1/p') 
Hi) if p > p' then 



&p,q ^ & p >, q 



p,q p',q'' 



iv) we have B^ ^ = C a in the sense of equivalence of norms: 



B«^C a andC a ^B« 
Approximation by step functions. Consider a partition of $[0, 1]$ into $n$ intervals $W_{j,n}$, $j = 1, \dots, n$ of equal length and for any $f \in L_2(0, 1)$, let $f_n$ be the $L_2$-projection onto the piecewise constant functions, i.e.

$$f_n = \sum_{j=1}^n J_{j,n}(f)\,\mathbf{1}_{W_{j,n}}, \qquad \text{where } J_{j,n}(f) = n\int_{W_{j,n}} f(x)\,dx. \tag{5.8}$$



Lemma 5.3 For < a < 1 and / G i?2 2 we /lave 



\f-fn\\ 2 2 <4n- 2a \f\* 



Proof. Note first 



j = l "3,n 

For any interval (a, b) and e = b — a we have 

2 /■& / /•& 



/ (f(x)-e- 1 [ f(u)dv) dx= [ (e- 1 f (f(x)-f(u))dv) dx 

Ja V J a / Ja V Ja J 

f b -1 f b 2 

(by Jensen's inequality) < / e / (f(x) — f(u)) dudx 

J a J a 



I?" 1 t I* (f(x)-f(u)) 2 dudx. 



With a change of variable h = u — x the above equals 



2e~ l J j (f(x)- f{x + h)f dhdx 



W n ,n JW„, n JO 



(./(,') -J„.„(/))^/,<2»-^ / / J£ 1 +h)y dh ( i.r. 



(5. 



f-fn\\l = Y, ~ J ^(/)) 2 ( 5 - 9 ) 

• 1 ./W,\„ 



/a ^0 n 

Setting now (a, 6) = Wj, n , b = j/n and e = n _1 we obtain for j = 1, . . . , n — 1 

/ (/(x)-J Jin (/)) 2 dx<2n- 2 « / ~it + k))2 dhdx (5.10) 

whereas for j = n we have only the bound (|5.10p . i.e 
Hence from (5.9) by adding the upper bounds

n-1 



2 dx, 



l|/-/n||2 = E / (f(x)-Jj,n(f)fdx+ [ (f(x) - J n , n (f)) 

£ / (/(-) " J ^f)f dx < 2n-- [ 1/n /" 1/n ^-Jj^^ dxdh 
~[Jw j} „ In .In n 



JO 

1/n 



/•l/n 

<2n~ 2a / \\A h ff 2 h-( 2a+1 Uh 
Jo 



and 

/ (f(x) - Jn,n(/)) 2 dx < 2n- 2a f ~J}* + k))2 dhdx. (5.11) 

Now set 

5 (x,fc) = (/(z)-/(x + /0) 2 /r( 2Q+1 \ 

A = {(x,h) :0<h< l/n,0 < x < 1 - h} . 

Then 

/•1/n /• 

/ \\A h f\\ 2 2 h^ 2a+ ^dh = / g(x,h)d(x,h) 
JO J A 

and the second integral in (|5.1ip can be written in the same way but over a domain 

A* = {(x,h) :0< h< 1-x, l-l/n<x< 1} . 
Since A* C A and 5(2;, /i) > 0, we obtain 

g(x,h)d(x,h) + 2n~ 2a / g(x,h)d(x,h) 
J A* 

/■1/n 

<4n~ 2a / IIAft/H 2 ^-^- 1 - 1 )^ < 4n" 2a \f\] 
Jo 



\f-fn\\l<2n- 2a 



2.2 



Lemma 5.4 For 1/2 < a < 1 anc? / £ i?2 2 1,76 ^ ave 

11/ -/nil <C Q n 1 / 2 ~ Q 

Proof. For < /3 < 1, consider the Holder space with norm H/H^/3 (cf. (15. 7j> ). For 
/ 6 C^, the result 

||/-/n|L<n^ ||/|| c , 
is immediate. By Proposition 15.21 (ii).(i) and (iv), we have the embeddings 

Bh - B^ 2 ^ B^U 2 - (5.12) 
Setting (3 = a — 1/2, we obtain the result. ■ 
Periodic spaces. For any $f \in L_2(0, 1)$ and $0 < \alpha < 1$, let $\|f\|_{2,\alpha}$ be the norm defined in terms of Fourier coefficients analogous to (2.14), i.e.

$$\|f\|^2_{2,\alpha} := \gamma_f^2(0) + \sum_{j=-\infty}^{\infty} |j|^{2\alpha}\,\gamma_f^2(j), \qquad \text{where } \gamma_f(j) = \int_0^1 \exp(2\pi i j x)\,f(x)\,dx.$$



Let W a be the set of / where ||/|| 2q < c>o equipped with this norm. This is the periodic 
version of the Besov-Sobolev space 5 2 2 (thus a standard notation for W a would be B^ 2 ); 
we will prove one part of this claim via the embedding below. For a more comprehensive 
treatment cf. Triebel (1983), Theorem 9.2.1. 

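Lemma 5.5 For $0 < \alpha < 1$ we have

$$W^\alpha \hookrightarrow B^\alpha_{2,2}.$$

Proof. We will first establish the inequality

$$|f|^2_{B^\alpha_{2,2}} \le C_\alpha \left( \sum_{n=1}^{\infty} n^{2\alpha-1}\, \omega^2(f, n^{-1})_2 + \|f\|_2^2 \right). \tag{5.13}$$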



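To this end, note that for $h > 1$ we have $\omega^2(f,h)_2 = \omega^2(f,1)_2$ and therefore, by integrating over intervals $((n+1)^{-1}, n^{-1})$ and using the monotonicity of $\omega(f,\cdot)_2$,

$$\int_0^\infty \left( \frac{\omega(f,t)_2}{t^\alpha} \right)^2 \frac{dt}{t} \le \sum_{n=1}^{\infty} \frac{(n+1)^{2\alpha}}{n}\, \omega^2(f, n^{-1})_2 + \omega^2(f,1)_2 \int_1^\infty t^{-2\alpha-1}\,dt,$$

which in view of $\omega^2(f,1)_2 \le 4\|f\|_2^2$ gives a bound

$$|f|^2_{B^\alpha_{2,2}} \le 2^{2\alpha} \sum_{n=1}^{\infty} n^{2\alpha-1}\, \omega^2(f, n^{-1})_2 + 4\|f\|_2^2 \int_1^\infty t^{-2\alpha-1}\,dt$$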



and thus (5.13). Define a periodic version of $\|\Delta_h f\|_2$ by first extending the function $f$ outside $[0,1]$ periodically, and then setting

$$\|\tilde\Delta_h f\|_2^2 := \int_0^1 |f(x) - f(x+h)|^2\,dx.$$

The periodic modulus of smoothness is then

$$\tilde\omega(f,t)_2 := \sup_{0 < h \le t} \|\tilde\Delta_h f\|_2. \tag{5.14}$$



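Evidently we have $\omega(f,t)_2 \le \tilde\omega(f,t)_2$ for all $t > 0$. Now, using $\sin^2 u \le \min(1, u^2)$,

$$\|\tilde\Delta_h f\|_2^2 = \sum_{j=-\infty}^{\infty} |\gamma_f(j)|^2\, |1 - \exp(2\pi i j h)|^2 = 4 \sum_{j=-\infty}^{\infty} |\gamma_f(j)|^2 \sin^2(\pi j h) \le 4\pi^2 h^2 \sum_{|j| \le n} j^2\, |\gamma_f(j)|^2 + 4 \sum_{|j| > n} |\gamma_f(j)|^2.$$
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Consequently, taking the supremum over $0 < h \le n^{-1}$,

$$\begin{aligned}
\sum_{n=1}^{\infty} n^{2\alpha-1}\, \omega^2(f, n^{-1})_2 &\le \sum_{n=1}^{\infty} n^{2\alpha-1}\, \tilde\omega^2(f, n^{-1})_2 \\
&\le 4\pi^2 \sum_{n=1}^{\infty} n^{2\alpha-3} \sum_{|j| \le n} j^2\, |\gamma_f(j)|^2 + 4 \sum_{n=1}^{\infty} n^{2\alpha-1} \sum_{|j| > n} |\gamma_f(j)|^2 \\
&= 4\pi^2 \sum_{j=-\infty}^{\infty} |\gamma_f(j)|^2\, j^2 \sum_{n \ge |j|} n^{2\alpha-3} + 4 \sum_{j=-\infty}^{\infty} |\gamma_f(j)|^2 \sum_{n < |j|} n^{2\alpha-1} \\
&\le C_\alpha \sum_{j=-\infty}^{\infty} |\gamma_f(j)|^2\, |j|^{2\alpha} + C_\alpha \sum_{j=-\infty}^{\infty} |\gamma_f(j)|^2\, |j|^{2\alpha} \le C_\alpha\, \|f\|_{2,\alpha}^2,
\end{aligned}$$

which in conjunction with (5.13) proves the claim. ■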
For any $f \in L_2(0,1)$ (real-valued) and uneven $n$, let

$$f_n(x) = \sum_{|j| \le (n-1)/2} \gamma_f(j) \exp(2\pi i j x)$$

be its truncated Fourier series. The letter $C$ denotes generic constants depending on $\alpha$ but not on $f$.



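Lemma 5.6 For $1/2 < \alpha < 1$ we have

$$W^\alpha \hookrightarrow C^{\alpha-1/2}, \qquad \|f - f_n\|_\infty^2 \le C\, n^{1-2\alpha}\, \|f\|_{2,\alpha}^2.$$

Proof. The first relation follows from Lemma 5.5 and the embedding (5.12). The second then follows from the Cauchy-Schwarz inequality via

$$\|f - f_n\|_\infty^2 \le \Bigl( \sum_{|j| > (n-1)/2} |\gamma_f(j)| \Bigr)^2 \le \sum_{|j| > (n-1)/2} |j|^{2\alpha}\, |\gamma_f(j)|^2 \sum_{|j| > (n-1)/2} |j|^{-2\alpha} \le \|f\|_{2,\alpha}^2\, C\, n^{1-2\alpha}.$$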
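The last step rests on the tail estimate $\sum_{|j| > m} |j|^{-2\alpha} \le C m^{1-2\alpha}$ with $m = (n-1)/2$. A minimal sketch of this estimate follows, comparing a one-sided partial tail sum with the integral $\int_m^\infty t^{-2\alpha}\,dt = m^{1-2\alpha}/(2\alpha-1)$. The helper names and the truncation point `upper` are assumptions of this example, and $m \ge 1$ is required.

```python
def tail_sum(alpha, m, upper=200000):
    """Partial sum of j^(-2 alpha) over m < j <= upper (one side of the tail)."""
    return sum(j ** (-2.0 * alpha) for j in range(m + 1, upper + 1))


def integral_bound(alpha, m):
    """Integral of t^(-2 alpha) over (m, infinity); needs alpha > 1/2 and m >= 1."""
    return m ** (1.0 - 2.0 * alpha) / (2.0 * alpha - 1.0)


if __name__ == "__main__":
    for alpha in (0.6, 0.75, 0.9):
        for m in (1, 5, 50):
            print(alpha, m, tail_sum(alpha, m), integral_bound(alpha, m))
```

The two-sided tail is twice the one-sided one, so the constant $C$ can be taken proportional to $1/(2\alpha - 1)$, which explains why the restriction $\alpha > 1/2$ is needed.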



Let $\omega_{j,n}$ be the midpoint of the interval $W_{j,n}$, $j = 1, \ldots, n$ (cf. (5.8)).



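Lemma 5.7 For $1/2 < \alpha < 1$ and $f \in \tilde B^\alpha_{2,2}$ we have along uneven $n$

$$\sum_{j=1}^{n} \bigl( f_n(\omega_{j,n}) - J_{j,n}(f) \bigr)^2 \le C_\alpha\, n^{1-2\alpha}\, \|f\|_{2,\alpha}^2.$$
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Proof. Note that

$$\sum_{j=1}^{n} \bigl( f_n(\omega_{j,n}) - J_{j,n}(f) \bigr)^2 \le 2 \sum_{j=1}^{n} \bigl( J_{j,n}(f_n) - J_{j,n}(f) \bigr)^2 + 2 \sum_{j=1}^{n} \bigl( f_n(\omega_{j,n}) - J_{j,n}(f_n) \bigr)^2.$$

Here by Parseval's relation and the projection property of $f_n$ the first term is bounded by

$$2n\, \|f - f_n\|_2^2.$$

Since $\|f - f_n\|_2^2 = \sum_{|j| > (n-1)/2} |\gamma_f(j)|^2 \le C n^{-2\alpha} \|f\|_{2,\alpha}^2$, this term is at most $C n^{1-2\alpha} \|f\|_{2,\alpha}^2$.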



Thus it remains to show that

$$\sum_{j=1}^{n} \bigl( f_n(\omega_{j,n}) - J_{j,n}(f_n) \bigr)^2 \le C_\alpha\, n^{1-2\alpha}\, \|f\|_{2,\alpha}^2. \tag{5.15}$$



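We have

$$\begin{aligned}
\bigl| f_n(\omega_{j,n}) - J_{j,n}(f_n) \bigr| &\le n \int_{W_{j,n}} \bigl| f_n(x) - f_n(\omega_{j,n}) \bigr|\,dx \le n \int_{W_{j,n}} \Bigl| \int_{\omega_{j,n}}^{x} Df_n(t)\,dt \Bigr|\,dx \\
&\le n \int_{W_{j,n}} |x - \omega_{j,n}|^{1/2} \Bigl( \int_{W_{j,n}} (Df_n(t))^2\,dt \Bigr)^{1/2} dx \\
&\le n^{-1/2} \Bigl( \int_{W_{j,n}} (Df_n(t))^2\,dt \Bigr)^{1/2}.
\end{aligned}$$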



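Consequently

$$\sum_{j=1}^{n} \bigl( f_n(\omega_{j,n}) - J_{j,n}(f_n) \bigr)^2 \le n^{-1} \int_{[0,1]} (Df_n(t))^2\,dt.$$

By termwise differentiation and Parseval's relation the right side equals

$$n^{-1} \sum_{|j| \le (n-1)/2} (2\pi j)^2\, |\gamma_f(j)|^2 = 4\pi^2 n^{-1} \sum_{|j| \le (n-1)/2} |j|^{2-2\alpha}\, |j|^{2\alpha}\, |\gamma_f(j)|^2 \le C_\alpha\, n^{1-2\alpha}\, \|f\|_{2,\alpha}^2,$$

which establishes (5.15).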
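The bound of the sum in (5.15) by $n^{-1}\int_{[0,1]} (Df_n(t))^2\,dt$ can be illustrated on a single Fourier mode. The sketch below takes $f(x) = \cos(2\pi x)$, for which $f_n = f$ whenever $n \ge 3$ is uneven, and assumes $W_{j,n} = [(j-1)/n, j/n)$ and $J_{j,n}(f) = n\int_{W_{j,n}} f(x)\,dx$. All function names are hypothetical and chosen for this example only.

```python
import math


def cell_average_cos(j, n):
    """J_{j,n}(f) = n * integral of cos(2 pi x) over [(j-1)/n, j/n], in closed form."""
    a, b = (j - 1) / n, j / n
    return n * (math.sin(2 * math.pi * b) - math.sin(2 * math.pi * a)) / (2 * math.pi)


def midpoint_gap_sq(n):
    """Sum over j of (f(omega_{j,n}) - J_{j,n}(f))^2 for f(x) = cos(2 pi x)."""
    total = 0.0
    for j in range(1, n + 1):
        omega = (j - 0.5) / n  # midpoint of W_{j,n}
        total += (math.cos(2 * math.pi * omega) - cell_average_cos(j, n)) ** 2
    return total


def derivative_bound(n):
    """n^{-1} * integral over [0, 1] of (D f)^2, which is 2 pi^2 / n here."""
    return 2 * math.pi ** 2 / n


if __name__ == "__main__":
    for n in (3, 5, 9, 33):
        print(n, midpoint_gap_sq(n), derivative_bound(n))
```

For this smooth example the left side is far smaller than the bound, which is expected: the estimate is designed for functions that only have smoothness of order $\alpha < 1$.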



Remark 5.8 Periodic Besov spaces $\tilde B^\alpha_{p,q}$. These can be defined for $0 < \alpha < 1$ and $1 \le p, q \le \infty$ analogously to the spaces $B^\alpha_{p,q}$ as above, using a periodic increment norm $\|\tilde\Delta_h f\|_p$ and a periodic modulus of smoothness $\tilde\omega(f,t)_p$ defined analogously to (5.14). Clearly then $\tilde B^\alpha_{p,q} \hookrightarrow B^\alpha_{p,q}$. An intrinsic characterization in terms of Fourier coefficients $\gamma_f(k)$ is as follows: for $p \ge 2$ and $1/2 < \alpha < 1$ the expression

$$\left( \sum_{j=0}^{\infty} 2^{j\alpha q}\, \Bigl\| \sum_{2^{j-1}-1 < |k| < 2^{j}} \gamma_f(k) \exp(2\pi i k x) \Bigr\|_p^q \right)^{1/q}$$

is an equivalent norm in $\tilde B^\alpha_{p,q}$; cf. Nikolskii, sec. 5.6, relation (6) (cp. also Triebel (1983), definition 2.3.1/2).
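To make the dyadic frequency blocks in this characterization concrete, the following sketch lists the integers $k$ with $2^{j-1} - 1 < |k| < 2^j$. The index condition is read off the norm expression in the remark, and the function name `dyadic_block` is an assumption of this example. The blocks for $j = 0, 1, 2, \ldots$ are disjoint and together cover all integer frequencies.

```python
def dyadic_block(j):
    """Integers k with 2^(j-1) - 1 < |k| < 2^j, for j = 0, 1, 2, ..."""
    lower = 2.0 ** (j - 1) - 1.0
    upper = 2 ** j
    return [k for k in range(-upper + 1, upper) if lower < abs(k)]


if __name__ == "__main__":
    for j in range(4):
        print(j, dyadic_block(j))
```

Block $j = 0$ contains only the zero frequency, and for $j \ge 1$ the block consists of the frequencies with $2^{j-1} \le |k| \le 2^j - 1$.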
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